Elson et al., “Extracting Social Networks from Literary Fiction” (2010)

Just had the chance to read this intriguing paper on automated assessment of social networks in nineteenth-century British fiction, presented at this year’s ACL conference (and picked up on DH Now). I’m posting more for the link than anything else, but here are a couple of thoughts that are too long for Twitter …

The paper’s take-away point is that (British nineteenth-century) fiction set in an urban environment doesn’t seem to show the more diffuse social networks, relative to rural fiction, that one might expect following Bakhtin and others. Social networks in urban fiction turn out to be about the same size as those in rural fiction, and the connections between urban characters are, if anything, more robust than those between rural characters. The theory of chronotopes is said to take something of a hit here, though it’s by no means overturned.

The social networks in question are measured by the quantity of direct discourse exchanged between any two characters in a text. The dialogue in question needs to be presented in quotes, the people speaking or being spoken about need to be named (in a way amenable to algorithmic named entity extraction), and there can’t be more than 300 words of non-dialogic exposition between entries in a single conversation. The authors also prune minor and fleeting characters from their networks in order to keep them manageable. The methodological details are pretty interesting; have a look at the paper for the full run-down.
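To make the mechanics concrete, here is a toy sketch of the general idea, not the authors' implementation. It assumes quote attribution has already been done, so each quote arrives as a hypothetical `(speaker, start_word, end_word)` tuple. It also simplifies by counting adjacent exchanges rather than words of dialogue. The character names, the offsets, and the `min_total` pruning threshold are all made up for illustration; only the 300-word gap comes from the description above.

```python
from collections import defaultdict

MAX_GAP_WORDS = 300  # maximum non-dialogue gap between quotes in one conversation


def build_network(quotes, max_gap=MAX_GAP_WORDS):
    """Count exchanges between pairs of speakers.

    quotes: list of (speaker, start_word, end_word), sorted by start_word.
    Two consecutive quotes by different speakers form one exchange when
    no more than max_gap words separate them.
    """
    weights = defaultdict(int)
    for (s1, _a1, b1), (s2, a2, _b2) in zip(quotes, quotes[1:]):
        gap = a2 - b1  # words of exposition between the two quotes
        if s1 != s2 and gap <= max_gap:
            weights[tuple(sorted((s1, s2)))] += 1
    return dict(weights)


def prune_minor(weights, min_total):
    """Drop every edge touching a character with fewer than min_total exchanges."""
    totals = defaultdict(int)
    for (a, b), w in weights.items():
        totals[a] += w
        totals[b] += w
    return {
        (a, b): w
        for (a, b), w in weights.items()
        if totals[a] >= min_total and totals[b] >= min_total
    }


# Hypothetical quote spans, measured in word offsets from the start of the text.
quotes = [
    ("Anna", 0, 20),
    ("Ben", 25, 40),
    ("Anna", 45, 60),
    ("Clara", 500, 510),  # 440 words after Anna's last quote: a new conversation
    ("Anna", 515, 530),
]

network = build_network(quotes)
pruned = prune_minor(network, min_total=2)
```

With this made-up input, Anna and Ben end up with two exchanges and Anna and Clara with one; pruning at `min_total=2` then drops Clara entirely, which is the kind of effect taken up in point 3 below.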

This is compelling work and may be an important contribution to the way we think about urbanization in nineteenth-century fiction. There are a few tricky problems, though.

  1. Conversation seems like a pretty good proxy for social connectedness, but of course it’s a partial and imperfect gauge; there certainly could be others.
  2. The inability to detect and evaluate indirect discourse (in validation tests, the authors’ method missed about half of the relevant dialogic exchanges) might be especially important in urban settings. It’s possible (but by no means certain) that urban characters spend more time overhearing, summarizing, and recounting than speaking face-to-face. Or maybe urban novels emphasize indirect discourse as a means by which to convey some aspect of city life. The point is that there might be important differences both in the types of social networks presented through direct and indirect discourse and in the sheer quantity of indirect discourse in different types of fiction.
  3. And of course throwing out fleeting and minor characters, which might be expected to occur more often in urban settings, would tend to concentrate any measure of the resulting social network.
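
A quick way to see the worry in point 3 is with a made-up network. The sketch below uses plain graph density (edges present divided by edges possible) as a stand-in measure; I'm not claiming it is the statistic the paper reports, and the graph itself is invented. Three core characters all talk to each other, and three fleeting characters each talk to one core character only.

```python
def density(edges):
    """Fraction of possible undirected edges that are present.

    Only characters that appear in at least one edge are counted as nodes.
    """
    nodes = {name for edge in edges for name in edge}
    n = len(nodes)
    if n < 2:
        return 0.0
    return len(edges) / (n * (n - 1) / 2)


def prune_by_degree(edges, min_degree):
    """Single-pass pruning: keep edges whose endpoints both have min_degree or more."""
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    return [
        (a, b)
        for a, b in edges
        if degree[a] >= min_degree and degree[b] >= min_degree
    ]


# Hypothetical network: A, B, C form a core; D, E, F are fleeting contacts of A.
full = [("A", "B"), ("B", "C"), ("A", "C"), ("A", "D"), ("A", "E"), ("A", "F")]
core = prune_by_degree(full, min_degree=2)

density_full = density(full)
density_core = density(core)
```

Here the full network has a density of 0.4 (6 of 15 possible edges), while the pruned one has a density of 1.0 (3 of 3). If urban novels really do have more walk-on characters, pruning would hide exactly that kind of difference.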

Anyway, I’m fascinated by the work and don’t mean to pick nits. The paper is well worth a read. I look forward to seeing more from the group in the future.

