One more histogram, possibly of general interest. Below is a plot showing the number of literary titles by American authors published in the U.S. each year between 1850 and 1875 (black bars; counts via Lyle Wright’s 1957 bibliography as represented in Indiana’s holdings), along with the number of those titles held in fully edited form in the Wright American Fiction archive from Indiana and in the MONK project.
Note that this isn’t a stacked bar plot; you’re seeing three distinct histograms superimposed on one another. So if you’re looking just at the black bars, you’re seeing a comprehensive survey of American literary production around the Civil War.
Publication of literary texts drops off in the run-up to the Civil War and in its early years, then bounces back pretty quickly, even before the war is over. There are about 100 new books each year on average through the period.
Two notes for my own purposes. (1.) IU’s coverage of fully edited texts is around 40% of the total output for the period. That’s pretty good. Just as importantly, it hits that level roughly evenly for each year, so there’s no need to worry about serious variations from year to year or about individual years with very low representation (though be careful with, e.g., 1860–61). (2.) I like what MONK did with its 300-text subset, clustering texts as far on either side of the war as possible. Even if you were only working with MONK, you’d still have a decent chance of picking out ante-/post-bellum features.
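For anyone who wants to repeat the evenness check from note (1.) on their own counts, here is a minimal sketch of how it might be scripted. The function names, the 30% floor, and the counts are all my own placeholders for illustration; the numbers are made up and are not the real Wright or IU figures.

```python
def coverage_by_year(total, held):
    """Return {year: share of that year's titles held in edited form}."""
    # Years with no published titles are skipped to avoid dividing by zero.
    return {year: held.get(year, 0) / n for year, n in total.items() if n > 0}


def low_years(coverage, floor):
    """Return the years whose coverage falls below `floor`, in order."""
    return sorted(year for year, share in coverage.items() if share < floor)


# Made-up placeholder counts, NOT the actual bibliography or archive data.
total_titles = {1859: 100, 1860: 80, 1861: 60, 1862: 90}
edited_titles = {1859: 40, 1860: 20, 1861: 15, 1862: 36}

coverage = coverage_by_year(total_titles, edited_titles)

# Overall coverage for the whole span, as a single ratio.
overall = sum(edited_titles.values()) / sum(total_titles.values())

# Years that deserve a second look (the 0.30 floor is an arbitrary choice).
flagged = low_years(coverage, floor=0.30)

for year in sorted(coverage):
    print(year, f"{coverage[year]:.0%}")
print("overall", f"{overall:.0%}", "flagged", flagged)
```

The point of computing the share per year rather than only the overall ratio is that a healthy overall figure can hide a few thin years, which is exactly the caveat about 1860–61.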