With the kind assistance of several folks at Indiana, I’ve now gotten my hands on IU’s full holdings of the digitized Wright American Fiction collection. This is the literary corpus, spanning 1850–1875, from which the MONK texts I used for my initial mapping project were drawn. MONK, however, limited its Wright-based corpus to around 300 volumes to keep its several datasets balanced.
IU has roughly 900 additional Wright texts that have been fully edited and XML encoded (plus 1,300 more that have been OCR’ed and XML encoded but not hand edited). This means both the depth and the temporal coverage of my data for the period around the Civil War just got way better.
More info and results to come as I work my way through this stuff. In the meantime, here’s a plot of the temporal distribution by original publication date of the texts in the two corpora: