New article in Cultural Analytics

Elizabeth Evans and I have a new article, “Nation, Ethnicity, and the Geography of British Fiction, 1880-1940,” out in Cultural Analytics.

It’s a lengthy, data-driven exploration of the literary geography of British writing during the modernist era, but it’s not really about modernism per se. We compare geographic usage in books by well-known writers to that in books by mass-market and colonial authors, and we spend some time trying to assess how British literary geography changed over six decades.

A few highlights:

  • Books by famous writers — including canonical modernists — became more international over time. But they lagged the international attention of mass fiction by a decent amount and that of foreign-born writers by a lot. If critics are interested in modernism in part because it was less domestic and provincial than Victorian lit, they could find even more striking examples elsewhere.
  • Foreign writers of color devoted much more of their London attention to parks, rivers, and other green spaces than did other groups of writers. No such effect for foreign-born white writers. A concrete effect of differential social access to domestic and commercial spaces?
  • London really dominated British domestic attention, even more than the population of the city would suggest. The U.S. case is different. We’d love to have comparable data for French lit.
  • There were significant differences in the specificity of geographic usage between groups of writers. Foreign-born authors used many more non-UK places, but they were more likely to favor higher levels of abstraction (“India” rather than “Delhi,” for example). We think this is tied to social and political content over against setting and characterization.
  • We don’t find evidence of any strong, across-the-board uptick in geographic use during the period. But it’s not hard to find select places that were used more intensively after, say, WWI, and therefore not hard to see how the impression of a geographic turn could have taken hold.
  • You can explore an interactive visualization of authors’ geographic similarity.

Here’s how we frame the intervention in modernist studies:

Our results lead us to three broad interventions in modernist literary studies. First, we argue that a modernist studies that values internationalism must devote significantly more attention to non-canonical literature. The mass run of fiction published between 1880 and 1940 was consistently and meaningfully more international than its better-known analogues. Writing by non-native British writers was radically more so. If critics are drawn to the outward turn in modernist texts, they can and should find a larger, earlier, and perhaps more important version of the phenomenon by looking beyond the usual suspects.

Second, we need to rethink London as it was encountered and described by outsiders. This isn’t just a matter of turning away from the famous and the posh in favor of the neglected and the downtrodden (though there are worse places to start). It’s about explaining, for instance, why foreign writers of color depict a more public, verdant London than their colony-born white counterparts, while devoting less of their attention to the East End and to notably international districts of the city. These patterns are either anecdotal or essentially invisible to conventional study. Computational methods make them available for nuanced literary-historical reinterpretation.

Finally, we argue against treating the years between 1880 and 1940 in terms that emphasize temporal discontinuity. Aspects of British fiction did change across this span of sixty years, and many of the differences we observe in the era’s literary-geographic attention are genuinely important. But when we work at scale, it’s very difficult to locate “on or about …” moments of sudden change across whole ranges of texts. We see instead situations of influence and drift or—and this is the rub—we find true ruptures only between corpora built around differing principles. The latter case, comparing corpora assembled to emphasize difference, is the one that resembles most closely the way in which modernist studies built its canons. Those canons and the practices they embed aren’t simply errors, but they are deliberately and systematically nonrepresentative of large-scale literary history. Modernist literary critics would do well to grapple with that fact more directly than we often have.

There’s a lot more in the article, and the underlying code and data are freely available. Check it out!

Our thanks to the NovelTM group, where we first workshopped the paper; to Stephen Ross, who offered helpful feedback on the manuscript; and to the NEH, which supported our work via a grant to the Textual Geographies project.

Books I Read in 2017 (and 2016, and 2015)

It’s been … (checks calendar, hangs head) … three years since I last did one of these? I am slack. As ever, I enjoy hearing about what other people are reading, so I figure that the least I can do is share alike. Here’s the fiction (only) that’s kept me busy since the start of 2015. Other posts in this series go back to 2009. FYI, links are to Amazon, but aren’t affiliates.

  • Beatty, Paul. The Sellout (2015). Enjoyed this a lot, though I don’t think I could teach it.
  • Blixen, Karen. Out of Africa (1937). Not sure why I decided to close this particular gap this year.
  • Burroughs, William S. Naked Lunch (1959). Tested my patience.
  • Cole, Teju. Open City (2011). Wonder if I might pair this with Jenny Offill’s Dept. of Speculation (short, experimental, New York) the next time I do the contemporary U.S. fiction class.
  • Ferrante, Elena. My Brilliant Friend (2011). Wasn’t moved the way that others seem to have been.
  • Fruhlinger, Josh. The Enthusiast (2015). A friend’s first novel. Genuinely good.
  • Hawkins, Paula. The Girl on the Train (2015). Wasn’t gripped by it.
  • Heti, Sheila. How Should a Person Be? (2010). Serious question: when did “art monster” become a thing?
  • James, Marlon. A Brief History of Seven Killings (2014). Very good, but you don’t need me to tell you that.
  • July, Miranda. The First Bad Man (2015). Really liked this.
  • King, Lily. Euphoria (2014). I remember liking this, though not much else about it.
  • Klay, Phil. Redeployment (2014). Ditto.
  • Lethem, Jonathan. Amnesia Moon (1995). A lot of dystopias these last few years. Not crazy about that, though I am apparently a slow learner of my own tastes.
  • Mandel, Emily St John. Station Eleven (2014). See Lethem.
  • Mantel, Hilary. The Assassination of Margaret Thatcher (2014). One of two story collections (with Klay). I usually prefer novels. But, man, is Mantel good.
  • Marr, Andrew. Head of State (2014). Enjoyably trashy.
  • McCarthy, Cormac. The Road (2006). See Lethem.
  • McCarthy, Tom. Satin Island (2015). I preferred Remainder, but perfectly serviceable.
  • Murray, Paul. The Mark and the Void (2015). The metafiction eventually wore thin for me.
  • Nguyen, Viet Thanh. The Sympathizer (2015). About as good as everyone says.
  • Ortberg, Mallory. Texts from Jane Eyre (2014). God, I miss The Toast.
  • Price, Richard. Clockers (1992). Couple of decades late to this party. Sorry to have missed it for so long.
  • Pym, Barbara. Excellent Women (1952). Damn you, Mallory Ortberg. I will read literally anything you tell me to.
  • Robinson, Marilynne. Gilead (2004). The best thing I’ve read in … years? How many years? A decade, at least, I think.
  • Robinson, Marilynne. Housekeeping (1980). Not as good as Gilead, but still awfully good.
  • Saunders, George. Lincoln in the Bardo (2017). Technically interesting, left me a little flat. Surprising, given how much I enjoy his short fiction.
  • Stout, Rex. Fer-de-Lance (1934). See figure 5.
  • VanderMeer, Jeff. Annihilation (2014). Looking forward to his visit to Notre Dame this year.
  • Whitehead, Colson. The Underground Railroad (2016). Other than Zone One, the best thing of his since The Intuitionist.
  • Yamashita, Karen Tei. I Hotel (2010). Left me cold.

I tried Kindle samples of another couple of dozen novels. Some that I’d like to come back to eventually. FWIW, I do almost all my reading on a screen of one sort or another. No frickin’ deckle edge.

Numbers? 30 books in three years. Not going to win any awards. 13 by women, 17 by men. Mostly Americans, for professional reasons, though it’s a mug’s game to divvy them up in detail. None that I hated, maybe two or three that I read more from obligation than desire, and a handful that were full-on great. The Robinson was the real standout.

First up in 2018 is Julien Gracq’s A Balcony in the Forest or Helen Phillips’s The Beautiful Bureaucrat. Or something else. You never know.

Visualizing geographic intensity

I’ve been thinking recently about how to visualize the distribution of geographic attention. There are a bunch of ways to do this, of course, no one of which is categorically correct. It depends on what you’re trying to capture. But a typical approach for me has been to use a bubblemap with marker areas corresponding to the number of times a specific location is mentioned in a collection of texts. A couple of examples:

Since the markers are transparent, you get a sense of the density as the fill color builds toward fully opaque.
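For anyone who wants to try the bubblemap idea, here is a minimal sketch of the scaling step. It is not the code behind the maps in this post: the function names, the place counts, and the approximate coordinates are all made up for illustration. The one thing to get right is that the `s` argument to matplotlib's `scatter` is an area (in points squared), so scaling it linearly with the count makes marker area, rather than radius, track the number of mentions.

```python
def marker_areas(counts, max_area=400.0):
    """Scale mention counts to marker areas so that area is proportional to count.

    `counts` maps a place name to its number of mentions; the most-mentioned
    place gets `max_area` (in points^2, the unit matplotlib's `s` expects).
    """
    biggest = max(counts.values())
    return {place: max_area * n / biggest for place, n in counts.items()}


def plot_bubbles(coords, counts, alpha=0.3):
    """Draw a bare bubblemap (no basemap) with semi-transparent markers.

    `coords` maps a place name to (longitude, latitude). Overlapping markers
    build toward opaque because each one is drawn with the same low alpha.
    """
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    areas = marker_areas(counts)
    places = list(counts)
    fig, ax = plt.subplots()
    ax.scatter(
        [coords[p][0] for p in places],
        [coords[p][1] for p in places],
        s=[areas[p] for p in places],
        alpha=alpha,
        color="tab:red",
        edgecolors="none",
    )
    return fig


# Hypothetical data: invented counts, rough coordinates.
example_counts = {"Hyde Park": 120, "Westminster": 60, "Whitechapel": 30}
example_coords = {
    "Hyde Park": (-0.166, 51.507),
    "Westminster": (-0.136, 51.498),
    "Whitechapel": (-0.060, 51.519),
}
```

Calling `plot_bubbles(example_coords, example_counts)` returns a figure you can save; adding a basemap is a separate step.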

There are other ways you might try to capture aspects of this problem. Heatmaps are an obvious idea, something like this cool (but sadly defunct?) map of world “touristy-ness” based on geotagged photos:


One thing that’s tricky about heatmaps is that, by convention, dark areas represent low values and light areas represent high values. That works well when layered over dark basemaps (as above), but less well over light ones:


The dark edges suggest a stark boundary where in fact there is none. And I really want to use light basemaps, since dark ones reproduce poorly in print.

Anyway, you can fight convention, of course, and use a light-to-dark colormap. I’ll show a version of that approach below. But I also came across a great post from Agile Scientific that discussed two other methods: contour lines and hillshading, both familiar to anyone who has used a topographic map. Here’s Agile’s result, showing the seabed off Nova Scotia:


Neat! OK, so how would it look with the London data from the very first map above? Here are a few possibilities:


Clockwise from top left: bubblemap, heatmap, heatmap with contour lines and hillshading, and bubblemap with contour lines (but no hillshading).

These visualizations emphasize different aspects of the data. The bubblemaps are good for identifying the specific locations in question. The heatmap alone highlights areas of highest density. The hillshaded map reminds me of the Z-Axis project, though tuned to be less “peaky” — which the Z-Axis tool also allows — and without the tricky basemap transformation.
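A rough sketch of how the heatmap, contour, and hillshading layers can be built from the same point data follows. It assumes a simple Gaussian kernel summed over weighted points, which may not match what the maps above actually use, and the function names, bandwidth, and grid size are placeholders of my own. It also measures distance in raw degrees, which ignores map projection and is only tolerable over a city-sized area.

```python
import numpy as np


def density_grid(lons, lats, weights, extent, shape=(100, 100), bandwidth=0.01):
    """Sum a Gaussian kernel over weighted points on a regular lon/lat grid.

    `extent` is (west, east, south, north) in degrees; `bandwidth` is the
    kernel standard deviation, also in degrees. Returns the grid's x and y
    coordinates and a (rows, cols) array of density values.
    """
    west, east, south, north = extent
    gx = np.linspace(west, east, shape[1])
    gy = np.linspace(south, north, shape[0])
    xx, yy = np.meshgrid(gx, gy)
    grid = np.zeros(shape)
    for lon, lat, w in zip(lons, lats, weights):
        d2 = (xx - lon) ** 2 + (yy - lat) ** 2
        grid += w * np.exp(-d2 / (2 * bandwidth ** 2))
    return gx, gy, grid


def plot_density(gx, gy, grid, n_levels=6):
    """Layer a light-to-dark heatmap, hillshading, and contour lines."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting
    from matplotlib.colors import LightSource

    extent = (gx[0], gx[-1], gy[0], gy[-1])
    fig, ax = plt.subplots()
    # "Reds" runs light to dark, so low densities fade into a light background.
    ax.imshow(grid, origin="lower", extent=extent, cmap="Reds", alpha=0.6)
    # Treat density as if it were elevation and light it from the northwest.
    shade = LightSource(azdeg=315, altdeg=45).hillshade(grid, vert_exag=1.0)
    ax.imshow(shade, origin="lower", extent=extent, cmap="gray", alpha=0.3)
    ax.contour(gx, gy, grid, levels=n_levels, colors="k", linewidths=0.5)
    return fig
```

The bandwidth does most of the work here: a small value gives a "peaky" surface that hugs individual locations, while a larger one smooths toward a general impression of where attention falls. The `vert_exag` argument plays a similar role for the shading.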

I maybe like the bubblemap-with-contours version best, since it allows you to see the specific locations and a representative density distribution at the same time. But I have to admit that, despite having spent a couple of days playing with this, I’m not finally convinced that I’ve found something broadly better for my purposes than the bubblemap with which I began. I worry in particular about mixing cartographic conventions; most people probably won’t be confused about London, but what about those who don’t know that city? Or somewhere less familiar? Do the contours add enough interpretive value to be worth the need to overcome their association with physical elevation? I suppose the answer depends on context.

In any case, this was a fun exercise. There’s code and sample data available on GitHub if you want to try it yourself or see how the maps were made. FWIW, these images depend on the standard Python data science stack (via Anaconda, in my case), plus the cartopy package for mapping operations and Mapbox tiles for the basemap. It would be interesting to translate the output format into something interactive, probably in Leaflet via Folium or something, which I’ve used in the past. A task for another day, though I’d be grateful to know how others deal with dual output intents. I’ve prioritized print with the static maps here, but it would be nice to flip a switch and get both.

Update: Speaking of good uses of heatmaps, have a look at Oxford’s RoadLess Forest project “global map of accessibility.” Shows travel times to the nearest city with a population over 50,000. Cool, pretty, interactive, and analytically useful.


Cultural Analytics 2017 Wrap-Up

The 2017 Cultural Analytics Symposium took place at Notre Dame on May 26 and 27. A thousand thanks to everyone involved, from speakers and respondents to audience members, organizers, and funders. I had a great time, met new people, and learned a bunch.

There are now videos of the talks available from the symposium site, as well as a YouTube playlist of the full event. Quality is a little scratchy, but generally serviceable.

Finally, a few follow-ups from participants. Howard and Kenton Rambsy wrote a series of great blog posts about several of the talks (including a condensed version of their own), emphasizing the ways that computational methods might extend work in African American literary studies. And Ted Underwood posted a version of his talk (with a link to data and code) on his site. I’m sure there are others that I’ve missed, so if you’ve written about the event, drop me a line and I’ll add a link here.

Thanks again to one and all. I’m excited to get back to work!

Cultural Analytics Symposium at Notre Dame

On May 26 and 27, Notre Dame is hosting Cultural Analytics 2017, a symposium devoted to new research in the fields of computational and data-intensive cultural studies. Combining methods and insights from computer science and the quantitative social sciences with questions central to the interpretive humanities, the event explores some of the most compelling contemporary interdisciplinary work in a rigorous, collegial environment.

The symposium is free and open to the public. For details including registration, schedule, and the full lineup of intimidatingly great speakers, see the Cultural Analytics 2017 site. Hope to see you there!

NEH Grant for Textual Geographies Project

I’m pleased to announce that the Textual Geographies Project has been awarded a $325,000 Digital Humanities Implementation Grant from the National Endowment for the Humanities. I’m hugely grateful for the NEH’s generous support and for previous startup funding from the ACLS and from the Notre Dame Office of Research.

I’m excited to work with project partners at Notre Dame, at the HathiTrust Research Center, and around the world. The grant will support further development of a Web-based front end for the enormous amount of textual-geographic data that the project has already generated, as well as ongoing improvements to the data collection process, new research using that data, and several events to engage scholars and members of the public who are interested in geography, history, literature, and the algorithmic study of culture. I’ll also be hiring a project postdoc for the 2017-19 academic years.

More information on all these fronts in the months ahead!

Postdoc in Computational Textual Geography


Update: The position has been filled. I’m very pleased that Dan Sinykin will be joining our group next year as a postdoctoral fellow.

I’m seeking a postdoctoral fellow for a two-year appointment to work on aspects of the Textual Geographies project and to collaborate on research of mutual interest in my lab in the Department of English at Notre Dame.

The ideal candidate will have demonstrated expertise in literary or cultural studies, machine learning or natural language processing, and geographic or spatial analysis, as well as a willingness to work in new areas. The fellow will contribute to the ongoing work of the Textual Geographies project, an NEH-funded collaboration between literary scholars, historians, geographers, and computer scientists to map and analyze geographic references in more than ten million digitized volumes held by the HathiTrust Digital Library. Areas of current investigation include machine learning for toponym disambiguation, named entity recognition in book-length texts, visualization of uncertainty in geospatial data sets, and cultural and economic analysis of large-scale, multinational literary geography. We welcome applications from candidates whose research interests might expand the range of our existing projects, as well as from those whose expertise builds on our present strengths.

Interdisciplinary collaboration with other groups at Notre Dame is possible. The fellow will also have access to the Text Mining the Novel project, which has helped to underwrite the position.


Details and application via Interfolio (free). Letters not required for initial stage. Review begins immediately and continues until position is filled. Salary $50,000/year plus research stipend. Initial appointment for one year, renewable for a second year subject to satisfactory progress. Teaching possible but not required.



My book, Revolution: The Event in Postwar Fiction, is out with Johns Hopkins University Press. OK, it’s been out since October, but still. I’m really excited about it.

Below is the description from JHUP’s site, and I have a related post, “5 Things You Might Not Know About Fifties Fiction,” on their blog as well. In brief, the book is about how one set of literary and cultural forms displaces another, especially as that process played out in the United States after World War II. Want to know why fifties fiction is full of rambling allegories and why no one writes like Jack Kerouac? Or what those facts have to do with the French Revolution or the invention of quantum mechanics? You’ve come to the right place.

There’s a preview of the book available via Google. Should you be so inclined, you can buy the thing directly from the press, via Amazon, or wherever fine literary-critical monographs are sold. Want to review it? The press has your hook-up.

Here’s a fuller description of the project:

Socially, politically, and artistically, the 1950s make up an odd interlude between the first half of the twentieth century — still tied to the problems and orders of the Victorian era and Gilded Age — and the pervasive transformations of the later sixties. In Revolution, Matthew Wilkens argues that postwar fiction functions as a fascinating model of revolutionary change. Uniting literary criticism, cultural analysis, political theory, and science studies, Revolution reimagines the years after World War II as at once distinct from the decades surrounding them and part of a larger-scale series of rare, revolutionary moments stretching across centuries.

Focusing on the odd mix of allegory, encyclopedism, and failure that characterizes fifties fiction, Wilkens examines a range of literature written during similar times of crisis, in the process engaging theoretical perspectives from Walter Benjamin and Fredric Jameson to Bruno Latour and Alain Badiou alongside readings of major novels by Ralph Ellison, William Gaddis, Doris Lessing, Jack Kerouac, Thomas Pynchon, and others.

Revolution links the forces that shaped postwar fiction to the dynamics of revolutionary events in other eras and social domains. Like physicists at the turn of the twentieth century or the French peasantry of 1789, midcentury writers confronted a world that did not fit their existing models. Pressed to adapt but lacking any obvious alternative, their work became sprawling and figurative, accumulating unrelated details and reusing older forms to ambiguous new ends. While the imperatives of the postmodern eventually gave order to this chaos, Wilkens explains that the same forces are again at work in today’s fracturing literary market.

As I say, I’m super happy to have the book out in the world. I owe thanks to many, many people for their help along the way. Now, on to the next one!

Computational Approaches to Genre in CA

New year, catch-up news. I have an article in CA, the journal of cultural analytics, on computational approaches to genre detection in twentieth-century fiction. The piece came out back in November, but, well, it’s been a busy year.

The big finding — beyond what I happen to think is a nifty way of considering genre — is that certain highly canonical, male-authored novels of the mid-late twentieth century (by the likes of Updike, Bellow, Vonnegut, DeLillo, etc.) resemble one another about as closely as do mid-century hard-boiled detective stories. That is, very closely indeed. There are a couple of conclusions one might draw from this; my preferred interpretation is that the functional definition of literary fiction in the postwar period (and probably everywhere else) remains much too narrow. But there are other possibilities as well …

CA, by the way, has had some really great work of late. Andrew Piper’s article on “fictionality” is especially worth a read; Piper shows that it’s not just possible but really pretty easy to separate fiction from nonfiction using a basic set of lexical features.

Masterclass and Lecture at Edinburgh

I’m giving a two-and-a-half-day masterclass on quantitative methods for humanities researchers at the University of Edinburgh, 19-21 September, 2016. There’s a rough syllabus available now, with more materials to be added as the event draws nearer. If you’re in Scotland and want to attend, there may be (literally) a place or two left; details at the Digital Humanities Network Scotland.

There will also be a public lecture on the evening of Wednesday, September 21, featuring a response and discussion with the ever-excellent Jonathan Hope (Strathclyde).

I’m grateful to Maria Filippakopoulou for organizing the visit and to the Edinburgh Fund of the University of Edinburgh for providing financial support.