Why I’m in Favor of the Google Book Search Settlement

When Google announced their book-scanning project five years ago, most academics I talked to about it were pretty happy. These days a lot of that enthusiasm seems, if not to have disappeared, then at least to have been tempered by serious doubts. I share some of these, but on the whole the settlement is a profoundly good thing. I support it, and I hope my colleagues will, too.

About the settlement

First, two notes: One on the underlying legal issue and one on what’s at stake. The publishers and authors (via the AAP and the Authors Guild, respectively) sued Google for alleged “massive copyright infringement” shortly after Google began scanning books from several prominent libraries. The theory is that because Google makes a copy of every book they scan, they require the rightsholders’ permission to assemble their book search database. Google says the process is covered by the fair use exception in copyright law and is no different from their Web search business, which also copies texts in order to index them. How this would be decided in court is unknown, mostly because the legal definition of fair use is extremely and deliberately vague.

But it’s clear that Google has a lot more to lose than do the publishers and authors if the case were to go against them. If the publishers were to lose, Google could index their stuff without further permission. But it’s hard to see how they’d be hurt by that, since it would only help people find books and wouldn’t change the strong basic copyright protections they already enjoy. Google still wouldn’t be able to sell or give away in-copyright books, for instance. Google, on the other hand, could be destroyed if they were to lose. They’d be on the hook for God knows how much in damages, of course (willful infringement of copyright carries maximum statutory damages of $150,000 per instance). But—and this is much more important—because there’s no fundamental difference between the copyright protections for Web pages and those for books, a decision in favor of the publishers would effectively outlaw search as it currently exists. Would a court dare do that? I have no idea, but Google obviously took the threat seriously enough to settle rather than to fight, especially since Web search is everything to them, whereas books are a comparative hobby. I wish Google had chosen to go to trial, because I think (and hope) they would have won, thereby clarifying and solidifying fair use rights in computational contexts, but it’s neither my money nor my business that’s at stake, and I understand why they chose to settle.

This dispute about fair use is interesting in its own right, but it’s not in itself the main objection to the settlement from most of my academic friends. (Most academics, though certainly not all, are in favor of more liberal fair use rights, and would therefore usually side with Google on copyright issues.) They’re concerned instead about a missed opportunity for real reform, and about the perceived market power the settlement would grant to Google and the rightsholders. How so? Not over works that are already in the public domain; these are free to copy and redistribute already, and there’s nothing in the settlement that would (or could) change that. Anyone else could create a competing database of public domain works (see the Open Content Alliance, for instance). And it’s not about current books, whether in or out of print, which the rightsholders are free to dispose of as they wish—they can be bought, sold, and licensed according to the whims of the publishers and authors. Again, nothing in the settlement could possibly change this, since to do so would involve rewriting American copyright law. The issue, then, is over so-called “orphan” works, books for which an appropriate rightsholder cannot be established or contacted.

Here’s how things stand now with respect to orphan works: They’re simply off limits for anything beyond ordinary fair use. They can’t be reissued, corrected, or adapted. You can’t assign them in a college course, because no one can produce a new edition and you can’t make copies of your own or your library’s (rare) copy. You can’t use an orphan sound or video clip in a new song or film. And, absent a real answer to the fair use question raised by Google’s scanning project, you can’t include them in a search tool, because you can’t get a rightsholder’s approval to do so. It would be an exaggeration to say orphan works may as well not exist—they still do sit in libraries and archives—but they’re a lot less useful than either public domain or current works.

The settlement would establish a “rights registry,” a clearinghouse tasked with identifying and tracking rightsholders (if any) and copyright status for all books. As a practical matter, “all” would mean “those scanned by Google,” at least at first. Google would pay $34.5 million to establish this registry, which would then operate as a non-profit and work on behalf of rightsholders, distributing whatever funds it collects to the appropriate parties. In exchange for setting up this registry and paying a chunk of cash ($125 million in all), the publishers and authors drop their copyright infringement claims (so Google can go on scanning). Maybe more importantly, as far as my uncomfortable academic friends are concerned, Google gets the right to scan, process, and sell orphan works, even though their proper rightsholders can’t be determined, and they get indemnity from lawsuits if they make honest mistakes about the copyright status of a work (and sell it or offer it for free when they shouldn’t, for instance). Rightsholders can opt out of this arrangement at any time, though of course they’ll then lose the benefits of being available through Google.

Some objections

This all looks pretty win-win. Google gets to do what they do, maybe opening up a big new market in the process, and they remove a significant legal cloud hanging over them. Publishers and authors get a pile of cash, a new outlet for their goods, and they get to sell a bunch of old stuff that’s currently out of print. Users win because they get a search and information resource that they wouldn’t otherwise have had.

The concern, though, is that Google is the only would-be scanner to benefit directly from the settlement. The settlement leaves unanswered the fair use question about book scanning. It leaves unchanged the status of orphan works, but allows Google alone (at least at first) to make use of them. And it gives two private, for-profit entities (the Authors Guild and the Association of American Publishers) control over the rights registry.

Wouldn’t it be better, these friends of mine say, to resolve these issues legislatively, so that the law would be clear and everyone would stand on level ground? Couldn’t we create a limited right to use orphan works, to store “non-consumptive” copies of texts for computational use, and set up a public rights registry? Wouldn’t that provide better and fairer competition in the marketplace? Absent those changes, don’t we risk creating a situation in which there are only two (cooperating) players (Google and the rightsholders) in the marketplace? Would any other company be able to negotiate an equivalent agreement with the rightsholders? Especially since those rightsholders wouldn’t have any incentive to help set up a competitive market for their products? Would any other company have the resources to scan millions of books, especially after Google has a head start on both the technical and the business sides? Isn’t this our one big chance to get scanning done right? Aren’t we missing a great opportunity to reform a badly out-of-whack U.S. copyright regime? And won’t libraries be almost required by their patrons to subscribe to Google’s digital products, available at only monopoly prices?

My answers

I share many of these concerns. But I still think we’ll be much, much better off with the settlement than without it. Here’s why:

Copyright reform

We do need copyright reform, including provisions for orphan works. But I don’t think we’ll ever get it, especially in the absence of the settlement. When has Congress ever scaled back any part of copyright protection? Is there any reason to think it will do so now or in the foreseeable future? Even if it were to, how long do you think we’ll have to wait for it, given our current political priorities, making no progress on things like book search and computational analysis rights in the interim?

Our current copyright regime—which allows for effectively endless copyright protection without any provision for an evolving public domain—is totally out of alignment with the social cost/benefit analysis that authorizes U.S. copyright law. I don’t think there’s any chance that’s going to change, but if the settlement is approved, there will at least be large, powerful, monied interests (cf. Microsoft, Amazon, and Yahoo, all of which recently [re-]joined the Open Content Alliance) lobbying to create specific provisions relaxing aspects of copyright control like those affecting orphan works and computational use. This differs from the current situation in which all the money and influence is on the other side. And they’ll have a legislator-friendly argument, namely that they’re just trying to compete in the marketplace on terms equal to Google’s. So far, they haven’t had to make this push, because no one has been making much money there. The settlement will change those incentives.

[Note in passing that the Berne Convention is always going to pose problems, since it’s built around absurdly strong European-style (“moral”) copyright provisions that prohibit things like registration requirements. The U.S. has never, of course, been especially keen on international agreements, but copyright protection is one of its long-standing hobby horses. It seems unlikely that the U.S. government would push for serious changes to Berne.]

An open market

There’s no reason to believe other entities won’t be able to enter the marketplace. The settlement provides only non-exclusive licenses to Google, and will serve as a ready-made template for a legal agreement between the rightsholders and any future scanners. Moreover, there would surely be serious antitrust scrutiny if the rightsholders were to withhold similar terms from others who wanted to enter the market. And why would they, really? More outlets means more differentiated products and more opportunities to sell their goods. Plus, with the registry already in place and both scanning and storage getting cheaper by the day, the barriers to entry are falling with time, not rising.

The status quo

What’s the alternative? If the settlement isn’t approved, no one can go ahead with any scanning projects. Not even those limited to the public domain (which, as noted, is less relevant by the day, because nothing new will ever fall into it); it would only take one mistaken scan of a protected work to expose a scanner to bankrupting litigation. Our current copyright system, written exclusively for content creators without even a nod to the public interest, will go on unchanged. And the public, academics and normal people alike, will have lost a terrifically promising resource, one assembled at significant cost and risk (if not with strictly altruistic motives) by a private company at almost no expense to us.

Library costs

Finally, libraries will, as always, have a choice to make about how they spend their subscription money, including whether or not to buy extended access to Google’s offerings. But they’ll already have free access (albeit at a single “terminal,” whatever that will mean in practice) to all of Google’s digital holdings. If prices are too high and they choose not to subscribe, they’ll still be better off than they were to begin with, since they’ll have one terminal with millions of in-copyright books, rather than none, as they do now. And how different is this situation from the one that holds with respect to commercial presses and journal publishers? Those publishers are already effective monopolies, and no one (alas!) seems to be suggesting legislation to change that fact. Do you think Google will be better or worse? How much do you pay for Google’s services now? Plus, if I’m right and other companies or not-for-profits enter the market, any monopoly concern disappears.

Summary

My argument here isn’t so different from the one progressives are now making about health care reform: The current situation is really, really bad. This plan makes things a lot better, with minimal downsides. I’d like real copyright reform as much as I’d like single-payer healthcare, but I think they’re about equally likely. So let’s not let the perfect be the enemy of the good.

Now, there’s a chance that a defeat for the settlement would be galvanizing in its own way, and that it would give rise to serious copyright reform. My own feeling is that if Eldred v. Ashcroft didn’t do it, nothing will. Maybe I’m wrong, but I’d much rather have Google Book Search and all it entails, plus the settlement-provided computational research corpus, a useful and well-funded rights registry (a significant public good), the plausible prospect of a thriving marketplace for digital texts and products based on them, and the first ever relaxation of at least a few copyright protections, than torpedo the settlement in hope of getting a marginally better legislative result that’s a huge longshot.

There are Parallels and there are Parallels

Finally, my answer to the first of the questions I posed about Disgrace. I’m answering it last because it’s the most important and because the solution depends on the positions one has taken on the others.

Question

  • What is the relationship between the two rapes?

Short answer: As a moral matter, they are unrelated. It is tempting to read them otherwise, but to do so produces unworkable interpretations of the novel as a whole.

It’s probably necessary to begin by affirming that there are indeed two rapes in the novel. Lucy’s is straightforward (that is, the fact that Lucy has been raped is never in dispute), but I’ve sometimes been asked (by colleagues and students alike) about Melanie’s, whether it’s altogether appropriate to call her treatment by Lurie rape. I believe it is, and that this isn’t a hard call. The problem, such as it is, is that the only account we have is Lurie’s own, from the much quoted passage in which he refers to his second sexual encounter with Melanie as “not rape, not quite that” (25). A fine distinction, coming from the perpetrator. But Melanie has already said no to his “words heavy as clubs” that “thud into the delicate whorl of her ear,” and she has struggled in his grasp (24–25). Once it is clear that “nothing will stop him,” she “does not resist [further]. All she does is avert herself … As though she had decided to go slack, die within herself for the duration, like a rabbit when the jaws of the fox close on its neck” (25). The language here is plain in its associations. There is also Lurie’s own acknowledgment that what he does to Melanie is “undesired to the core” (25). All this added to the necessarily unequal relationship that exists between them as teacher and student. One could go on, but this is surely enough. It is rape. The circumstances are different from Lucy’s, but that doesn’t change the nature of the crime in question, which is not simply “mistreatment” or “harassment.” (The latter terms are technically appropriate to the limited content of the inquiry, but not to the totality of Lurie’s actions.)

So, two rapes. Are they related? At a first level, no. The victims are different, the perpetrators are different, the circumstances are different, the associations are different. Lurie is involved, one way or another, in both, but not in a simple reversal as perpetrator and then as victim, which eliminates one potential point of direct continuity between them. At a second level, though, they obviously have something to do with one another. Lurie commits a rape that he is not inclined to see as an especially egregious crime. Later, he experiences at (nearly) first hand the trauma of rape and becomes its second-order victim (he is not raped, but his daughter is), after which (simplifying greatly) he expresses views about justice and punishment that are at odds with his earlier positions.

The second level, in which there is obviously some sort of connection between the rapes, raises questions that range in difficulty from “tricky” to “oh sweet Jesus.” Has Lurie reformed, having seen the error of his ways? Does the second rape offset the first? Which rape is worse? On what basis can one compare crimes in general and rapes in particular? How does sexual violation compare to other kinds of violation or loss or suffering? To what extent do historical and social factors mitigate or aggravate the seriousness of different crimes?

This is the stuff of philosophy and law. With respect to Coetzee’s novel, the answers will be terrifically complex and highly fraught. I, for one, don’t especially want to take them on in their full form, especially absent strong guidance from the text. But must we answer them? Well, we must if we think that Lucy’s rape serves as either a punishment for David’s crimes—be they literal or historical—or a commentary on their true nature. Why is that? Because punishment, as we often understand it, is supposed to fit the crime it redresses; if David suffers too much or too little as a result of Lucy’s rape, and if that rape serves as his punishment for raping Melanie, then his punishment will have been inappropriate. If Lurie in turn serves as a figure for South Africa’s privileged white minority, and if his crime stands in for his community’s historical abuses (both of which figurations I think are well supported in the text), then the novel’s position on the appropriateness of his punishment is critically important, assuming we do indeed see this as a matter of punishment.

The key here is that our usual concept of punishment is strongly related to our understanding of debt. See this post for a more complete argument, but the idea is that punishment is intended to extract from the perpetrator of a crime a loss equivalent to that imposed on the victim by the crime itself (plus an extra margin for deterrence). I think this is an imperfect but reasonable and ethical way to organize the law, though I don’t envy the task of equivalence-setting it imposes on legislators and judges. But is this really what Disgrace is up to, trying to figure out exactly how much a rapist should pay (in the currency of suffering) for his sins, or how much a privileged ethnic group should rightly expect to suffer at the hands of those it has wronged? And if so, what’s the answer? Does a beating, a grand theft auto, and the brutal rape of one’s daughter offset a milder rape one has committed oneself? (Incidentally, I use “milder” here in the sense of “a milder beating”; the comparison is obviously not to be confused with “mild” tout court.) Who among the current inheritors of racial exploitation must pay how much in the compensation of what suffering to whom?

These aren’t altogether absurd questions to ask of the novel, but I think Coetzee’s answer is something other than a straightforward position on the appropriate contours of compensation. Instead, the book says in effect “Who knows? As a practical, political, legal question, we’ll need to find a practical, political, legal answer, probably one that accounts for expediency as well as strict justice. This is important, but it’s not my concern (witness the obvious difficulties and unresolved tensions of the inquiry/TRC section). My concern is moral, about the ways in which one atones for one’s sins, if such a thing is possible.”

I think this question, about atonement, is the real core of the novel. And I think the answer is that as far as morality is concerned, debt doesn’t work as a model for atonement; you don’t repay your sins by suffering for them. The novel makes this point largely negatively, by showing the problems in which we become embroiled if we try to behave otherwise. Specifically, we end up needing to answer questions like the ones above; we need to say whether or not Lurie’s suffering is equivalent to Melanie’s, and whether Lucy’s is equal to those of apartheid’s victims. The book suggests that to do so in any strict sense is either impossible or objectionable, since it resembles much too closely a calculus of two (incalculable) wrongs making a right.

This leads us, finally, to a third view of the relationship between the rapes, which is that, so far as the book is concerned, they simply exist as brute facts in moral isolation. Their juxtaposition is an occasion for reflection, maybe even a prompt to ethical action, but they’re not meant to be weighed against one another. You can’t undo your sins, except to the extent that you can make your victims whole. If you can’t do that (and it’s not at all clear that one ever could, certainly not in the present case), your sins simply go on and on. You may well be more sinned against than sinning, but that’s not the point, since the relevant question doesn’t concern the balance of your moral accounts. The best you can do is to sin less (and less egregiously) in the first place and sin less in the future.

So that leaves us with Lurie as a sinful man, unredeemed and unredeemable. This was the point of many of my earlier thoughts about his actions; time and again, we’re presented with things that might be understood as redemptive, or at least as opportunities for redemption. And time and again they don’t pan out. This isn’t conditional; it’s required, the novel argues, by the nature of moral transgression itself.

This feels a little sketchy, but this post is already too long and I’ve gone on at some length about it elsewhere (see especially the paper linked from that post). One last thought, though. I promised to say something about the perfective (by which Lurie is seemingly fascinated), so here it is: We’re meant to understand many of the events in the second half of the novel as culminations of processes set afoot in the first. Lucy’s rape is the apotheosis of David’s sexual violence, its appalling perfection. David’s final abandonment of the dog is the perfection of his disgrace and destitution. The attack itself is the culmination of colonial exploitation, its necessary conclusion. The situation as a whole is one of finality, completion, the end of things (or, better, the end of a personal and historical situation) … perfection in the grammatical sense. And yet even these actions carried to perfection—to their retributive, debasing endpoint—don’t undo the sins that occasioned them. If that’s the case, then it’s hard to sustain the idea of atonement by (even perfect) sacrifice.

On Lucy’s Response to the Attack

Answers to two more of the basic questions about Disgrace, this time concerning Lucy’s reaction to her rape.

Questions

  • Why does Lucy refuse to report her rape or otherwise pursue legal remedy for it?
  • Why does Lucy remain on the farm after the attack?

Short answer to both: Because Lucy represents one, abnegatory pole of the novel’s imagined range of potential responses to guilt.

These questions are tricky, they’re related, and they’re absolutely central to making sense of the novel. Lucy offers several iterations of her thinking on both matters, always in conversation (of a sort) with Lurie. So we have her words, presented in direct dialogue, plus Lurie’s own speculation, both put directly to her and contemplated on his own.

Lucy’s words first. Immediately after the attack, she asks David “would you mind keeping to your own story, to what happened to you?” “You tell what happened to you,” she continues, “I tell what happened to me” (99). So from the beginning (minutes or hours after the attack), not only are their stories separable, but they are individual, personal. This is important, because it’s the basis of Lucy’s later claim that what happened to her isn’t properly public. “As far as I am concerned,” she says the next day, “what happened to me is a purely private matter” (112). This is true, according to Lucy, not because rape is always so, but owing to the historical contingencies of her situation:

“In another time, in another place it might be held to be a public matter. But in this place, at this time, it is not. It is my business, mine alone.”

“This place being what?”

“This place being South Africa.” (112)

Later still, however, Lucy confesses—in her only unprompted discussion of the attack with David—that she was baffled and shaken by exactly the personal investment of her attackers:

It was done with such personal hatred. That was what stunned me more than anything. The rest was … expected. But why did they hate me so? I had never set eyes on them. (156, ellipses in original)

It is in response to this question, in an attempt at palliation, that Lurie offers his much-quoted hypothesis that “it was history speaking through them … it may have seemed personal, but it wasn’t” (156).

The distinction here between personal and historical motives matters because it bears directly on Lucy’s reasons for not pressing charges and for staying on the farm. The problem, though, is that the public and private aspects of the attack and of her response to it are inseparable; Lurie is right that the attackers are motivated by impersonal forces, but that doesn’t preclude a deeply personal cathexis on Lucy as the specific object through or in which those forces find expression. So the attack is both personal, which means that these specific men remain a threat to Lucy, and impersonal, which means that apprehending them will do nothing to remove the general threat under which she lives.

That’s the attackers’ side of the equation; on Lucy’s, the problem is no simpler. As she says, what happened to her is personal; she has the right to respond to it as she sees fit and is under no obligation to treat it as a public matter (or, by the same token, as a private one). Moreover, what happened to her did not happen to David, except in the much different sense that his child was raped (an important point when we finally turn to the relationship between the novel’s rapes). But of course Lucy is aware that the attack was also fundamentally impersonal (that is, political) and that her response to it, whatever it might be, cannot but be public and political. She seems determined nevertheless to come as close as possible to removing herself from the public sphere, aware as she is of the reality that to press charges is to enter into a national phenomenon and debate concerning black-on-white violence. But it’s not as though simply accepting the situation avoids an entanglement with those same politics. That’s what she means when she says that what happened to her is a purely private matter: she can only treat it as private, else the consequences to her ethical and political self-image will be disastrous. “I must make the political decision,” she is saying, “to treat this as a nonpolitical matter.”

So that’s why she doesn’t press charges, because it is (to her) the least objectionable political response to an act that, under the circumstances, can only be construed politically.

Note that this addresses, at least in part, two of Lurie’s speculations about Lucy’s motives (see pp. 112 and 156). She doesn’t believe that by failing to press charges she will be spared further attacks, so hers is not an attempt to buy individual peace or reconciliation at the price of rape. I’m not so sure, though, that Lurie’s second guess—that she is trying to work out “some form of private salvation” (112)—misses the mark entirely. In the crudely psychologizing sense, yes, that’s wrong; she doesn’t want to suffer, and therefore doesn’t see the attack as a welcome opportunity for salvation. But in a political and moral sense, Lucy takes seriously the suggestion that she specifically and whites generally owe a historical debt to those who suffered under a social system that favored (and in a way continues to favor) people like her. “What if [another attack] is the price one has to pay for staying on? … Why should I be allowed to live here without paying?” (158), she muses. “Subjection. Subjugation,” she calls it, the new arrangement under which she is prepared to live (159).

Is this salvation? Probably not, and Lucy herself rejects the term as inappropriately religious. Still, she is plainly trying to work out an ethical sense of her obligations, ones that, as she acknowledges, are more (but not entirely) collective than individual.

This is then also the beginning of the answer to the second question, about why she stays on following the attack. Following the logic above, the problem is that she has not yet adequately atoned for her sins, not yet paid off her historical debt. So although she will likely be attacked again (or else give up to Petrus all she possesses in return for protection), that is what she understands to be demanded of her. This is at once absurd and obviously correct. Absurd because her suffering is so clearly out of proportion to her individual debt, particularly to these three men (whom she has “never seen before”—I suppose one could do something with the optics of power here, though I’m not so inclined). If she must expect, justly, to be raped, then nothing is prohibited in the aftermath of apartheid. But it’s also obviously correct if we accept the hypothesis that she is merely an object through which historical wrong begins to find redress; there is nothing she can give and nothing she can suffer that can possibly offset the wrongs of apartheid, which are very plainly greater than what can be undergone by any one person. If it is Lucy’s ethical obligation to compensate those victimized by her race, then she must be prepared to pay endlessly.

This is an unattractive conclusion, and I certainly don’t think it’s Coetzee’s, but I do think it’s one that the novel presents and works through, and I think it’s something that Lurie also confronts. (It would be worth thinking in particular about Life and Times of Michael K. in connection with this point, a novel that I think is underaddressed in studies of Disgrace.) My position is that Lucy is mistaken concerning the nature of her obligation, largely because she’s wrong about the metaphor of debt and repayment (see two earlier posts on ethics and debt, “Disgrace and Debt” and “Debt and Punishment”). But if she’s wrong, the error is hardly hers alone; it’s deeply embedded (rightly, perhaps) in our thinking about law and punishment. The novel’s suggestion, however, is that it’s misplaced in the case of moral transgression and obligation (again, see those earlier posts for the details of this argument).

There is also one further explanation of Lucy’s decision to stay on, one that’s closer to her own direct claims, namely that she has a kind of blind, unreflective compulsion to carry on carrying on. “Guilt and salvation are abstractions. I don’t think in those terms,” she says (112). Later she writes (to Lurie) “I am a dead person and I do not know yet what will bring me back to life. All I know is that I cannot go away” (161). Both of these declarations mirror her claim, the day after the attack, that although Lurie sees returning to the farm as “a bad idea … not safe,” “it was never safe, and it’s not an idea, good or bad. I’m not going back for the sake of an idea. I’m just going back” (105).

There’s a certain modified sense of existentialist ethics here insofar as Lucy remains committed to a kind of cause, but it is absent the important component of a meaningful decision or any guarantee that the course itself is a laudable one. And that’s a significant “but.” In fact it’s this blind drive that makes Lucy serve more as a totem for, or principle of, atonement without limit than as a full ethical actor in her own right. If there’s a full-fledged individual ethical figure in the novel—and I’m not at all convinced there is—it will therefore have to be Lurie or no one.

That’s it for questions two and three. One to go, the most difficult of all.

On the Killing of Dogs

Answers in this post to two more of the baseline questions about Disgrace, both of which concern Lurie and the significance of his relationship to dogs.

Questions

  • In what sense, if any, is Bev and Lurie’s euthanasia of the dogs an ethical/merciful/loving act?

Short answer: As ethical in sum, but complicated by the awareness that one could always do more.

  • Why does Lurie give up the dog at the end of the novel?

Short answer: From kindness and in atonement, but with persistent overtones of the flagellant’s self-involvement.

Hmm … these questions are getting harder. First, to the text: “The dogs suffer … most of all from their own fertility. There are simply too many of them” (142). Note in passing a related allusion to Jude the Obscure in “because we are too menny” (146), which adds an obvious note of pathos. What Bev and Lurie offer isn’t treatment (they are untrained as vets and lack the resources to do much for the animals even if they were) but the relief of a painless death. Bev, we are told, has a particular talent for “easing the passage” of each dog, to which she “gives her fullest attention” (142). Lurie is less skilled, but finally no less devoted. He is shaken by the work, and takes over, metaphorically speaking, the role of “dog-man” from Petrus. It is he who takes the dogs’ corpses to the incinerator and feeds them individually into the flames in order, he says, to preserve an image of the world “in which men do not use shovels to beat corpses into a more convenient shape for processing” (146). Finally and maybe most importantly, there is this description of their shared enterprise:

Sunday has come again. He and Bev Shaw are engaged in one of their sessions of Lösung. One by one he brings the cats, then the dogs: the old, the blind, the halt, the crippled, the maimed, but also the young, the sound—all those whose term has come. One by one Bev touches them, speaks to them, comforts them, and puts them away, then stands back and watches while he seals up the remains in a black plastic shroud.

He and Bev do not speak. He has learned now, from her, to concentrate all his attention on the animal they are killing, giving it what he no longer has difficulty calling by its proper name: love. (218-19)

A couple of things here. The fact that this is all happening on Sunday (as do all the euthanasia sessions; cf. “What the dog will not be able to work out (not in a month of Sundays!)” [219]), plus the persistently religious rhetoric (the communion-like procession of animals “one by one”; the crippled, the maimed, etc.; the shroud; love), plus the scene’s position on the penultimate page of the novel, push any reading toward terms of redemption. On the other hand, there’s the Holocaust-related Lösung (cf. Elizabeth Costello and The Lives of Animals), plus the decidedly non-euphemistic “killing,” plus Lurie’s decision at the end of this scene to give up Driepoot, the dog he could have saved (on which more below). So there’s real tension here, and we should be careful about flattening it out in any interpretation of the acts.

My take is that we are to understand Bev’s actions as driven by a love of animals generally, whose collective suffering it is her project to minimize. This is plainly both ethical and laudable. That part’s easy enough, and if the book went no further, things would be pretty simple (and uninteresting) on this point. It might be distasteful to kill the dogs, but it wouldn’t be a shattering experience. The reason, then, that the killing is so difficult—why it “gets harder all the time”—is that what’s good for the dogs collectively is in at least some cases (and perhaps most of them) bad for any one dog. That’s to say that while there clearly are instances in which killing an individual dog is an act of kindness (as when we put an aging pet “to sleep,” a euphemism the novel refuses), Bev and Lurie are aware that their toll includes as well “the young, the sound”—dogs, in short, that might go on living happily enough past the day of their execution, were it possible to do so.

A relevant question, then, is whether or not it is possible to save any one dog, something Lurie contemplates in the case of Driepoot:

He can save the young dog, if he wishes, for another week. But a time must come, it cannot be evaded, when he will have to bring him to Bev Shaw in her operating room (perhaps he will carry him in his arms, perhaps he will do that for him) and caress him and brush back the fur so that the needle finds the vein, and whisper to him and support him in the moment when, bewilderingly, his legs buckle; and then, when the soul is out, fold him up and pack him away in his bag, and the next day wheel the bag into the flames and see that it is burnt, burnt up. He will do all that for him when the time comes. It will be little enough, less than little: nothing.

He crosses the surgery. “Was that the last?” asks Bev Shaw.

“One more.”

He opens the cage door. “Come,” he says, bends, opens his arms. The dog wags its crippled rear, sniffs his face, licks his cheeks, his lips, his ears. He does not stop it. “Come.”

Bearing him in his arms like a lamb, he re-enters the surgery. “I thought you would save him for another week,” says Bev Shaw. “Are you giving him up?”

“I am giving him up.” (219-20)

It is Lurie’s claim that he cannot save the dog, not in the long run, and if that’s true of this dog, it is surely true of any dog. I will admit frankly that I find this scene, which ends the novel, deeply affecting. And so I want to believe what Lurie says, that he can’t save the dog, not beyond the next week or the week after that, thus that his decision to give him up is a matter of mature acceptance rather than weakness. Such a reading works nicely with a larger interpretation of the narrative as one of ethical development and redemption, showing Lurie as finally able to cease clinging to his own interest (he enjoys the dog’s affection and company, even if he holds himself slightly aloof from it, knowing its inevitable fate) and to do instead what must be done, even at significant cost to himself.

[Incidentally, Lurie’s fascination with the perfective (“burnt, burnt up”) is a recurring feature of the novel and might certainly be related to this reading, both concerning the dog and his own developmental arc. Why this is the case might have been elevated to a question of its own. There’s a bit more on it below, and if I remember I’ll say something more about it in connection with question one, on the relationship between the two rapes.]

I think there’s much in the novel that points toward this reading, which sees the “giving up” of Driepoot as the symbolic culmination (perfection) of a larger process of learning to live with less (or with nothing), i.e., without the historically determined advantages of Lurie’s initial position. This is a reading that restores maximum political advantage to the novel, though it might also be possible to tint it with irony and to complain (crudely, I think) that Coetzee thus likens the abuses of apartheid to losing a (potential) pet.

Still, even the unironic version of this reading strikes me as too neat by half. It aligns Lurie too closely with Lucy, who (for reasons I’ll talk about in a later post) serves as one (flawed) pole of a potential response to the crimes of history. It also removes much of the productive ambiguity that accounts for the bulk of the novel’s value; it is an intriguing and valuable book precisely because it refuses to give us obvious moral precepts (James Wood is wrong on this point), or perhaps more accurately because it takes seriously the defects of a field of related and competing precepts.

What’s the alternative? Well, that Lurie might indeed have saved the dog indefinitely and that while doing so would constitute only a small mercy in a field of enormous suffering, it would still be better than the alternative. The issue turns in part, I suppose, on what we’re to make of Driepoot’s life at the clinic, assuming Lurie cannot simply adopt him as a proper pet: the dog’s existence seems dull, of course, and he’s crippled, but he appears otherwise happy and certainly enjoys Lurie’s company. It’s hard not to feel a bit foolish trying to think this through, evaluating the utilitarian happiness of a lightly-described fictional dog, and I suspect I’d be unhappy with Coetzee if Lurie did save it (cheap sentiment, an animal better loved than humans, a minimal piece of literal grace or salvation to close the novel, etc.). But still, it’s not clear to me that killing the dog is as strictly required as Lurie claims; Lurie has come far down in station, but he’s not so utterly destitute that he can’t afford a bit of kibble. What he clearly cannot do, however, is save every other dog in Driepoot’s position.

The larger question, which hangs over not only this novel but much of Coetzee’s fiction, is what to do about problems so large that any individual action is dwarfed in proportion to them. Saving a dog will do almost nothing (“little enough, less than little: nothing”). The quote is pulled slightly out of context (it refers to what Lurie will do for Driepoot as the dog dies) but still, the slide from “little enough” to “nothing” is one that we ought not to overlook, which is the novel’s point. Lurie helps the dog’s passage because it is not nothing, even if on another level the result is the same in the end. The dog is still dead, but its final moments have been easier than they would have been otherwise.

The consequence of this final action, though, is to leave Lurie without salvation except in the most roundabout of ways. He cannot or will not save the dog, he cannot save Africa’s dogs, he cannot find companionship with an animal in lieu of a human. He has almost nothing. At best, he eases the death of dogs that do not wish to die and ought not—in a better world he is powerless to bring about—to need to. If there is atonement, it is through his affirmative role in the mortification of his own experience by killing Driepoot as a part of his (Lurie’s) world, that is, by seeing the dog as a part of himself that he is willing to give up. But to do that is to instrumentalize the dog as Lurie has done with people in so many earlier cases. A lesser sin, surely, than most of the others, but of a piece with them. If he is reformed, the novel tells us in the closing line, it is imperfectly.

Note that one of the features of our investigation of this and almost every other question to this point has been to undermine the prospects for reading the novel redemptively, with the result that it’s looking harder and harder to construct such a reading. Properly critical questions to keep in mind, then: Is it a problem if there’s no redemption here? And if there’s none, what’s the novel about anyway?

Answers to Some Questions about Lurie

Continuing with answers to the prerequisite questions about Disgrace, a couple on Lurie.

Question

  • How are we to treat Lurie’s opera?

Short answer: As an aesthetic failure, thus as non-redemptive.

It’s tempting, I think, to see the opera first as an opportunity for social rehabilitation—which is Lurie’s own early hope for it, though never his primary motivation—and second as a type of compensation for his sins. “It would have been nice,” he thinks, “to be returned triumphant to society as the author of an eccentric little chamber opera. But that will not be” (214). Despite his resignation on this point, he continues to hope (and this is the basis of the second reading, of the opera as compensation) that the work will contain “a single note of immortal longing” (214), but has given up on recognizing such a note himself (leaving the question instead to “history”). Really, though, Lurie has already abandoned any hope concerning the piece’s value:

The truth is that Byron in Italy is going nowhere. There is no action, no development, just a long, halting cantilena hurled by Teresa into the empty air, punctuated now and then with groans and sighs from Byron offstage. … It has become the kind of work a sleepwalker might write. (214)

We needn’t take Lurie’s word for it, of course, but we don’t have much else to go on and there’s no obvious reason to believe that he’s underestimating his own work’s merit. In light of the quotes above, the burden of proof should certainly lie with anyone who wants to read the work as redemptive, and I’ve not yet seen a compelling case for such a position. Still, there are a handful of readers who disagree to at least some extent: There’s Anker in the MFS article, for example, and Derek Attridge’s chapter on Disgrace in J.M. Coetzee and the Ethics of Reading (see p. 174ff).

This matters, one way or the other, because it bears on the issue of redemption or reconciliation in the novel, hence also on any political reading linked to either contemporary South Africa or post-exploitative situations. Art as redemption or expiation is an old, old theme; if Coetzee’s point is something along those lines, the novel is, as far as I’m concerned, seriously diminished. Happily, I don’t see any reason to read it that way, and plenty of textual evidence on the other side. Redemptive art is a rejected alternative in the novel, not its proposed solution to the problem of guilt and atonement.

Question

  • Why does Lurie sleep with Bev, and she with him?

Short answer: From a combination of compassion and desire, the precise amalgam of which remains uncertain.

This is trickier than it seems, though I go back and forth on how important it might be. As usual, we’re limited by Coetzee’s focalization of the narrative through Lurie alone. First, we should note in passing that we know for certain of only one time that they sleep together (the first, pp. 148-50); later we are told that although they “lie in each other’s arms” after “the business of dog-killing is over for the day,” they have “not made love; they have in effect ceased to pretend that that is what they do together” (161-62). It’s hard to gauge how much time has passed since their first encounter twelve pages earlier; enough that they can have “long since” ceased to pretend. In any case, we’re not privy to any further details of their affair, if that’s the right word for it.

As for the whys and wherefores of it, we can choose to believe Lurie’s explanation of Bev’s role, that she has acted so that “he, David Lurie, has been succoured, as a man is succoured by a woman; her friend Lucy Lurie has been helped with a difficult visit” (150). This interpretation, that Bev is largely free of desire and principally other-directed in the affair, isn’t entirely implausible. Bev is an intensely ethical figure, almost sainted in the novel (unless her acts of euthanasia are somehow undermined; more on this later in answer to question six), so it wouldn’t be a stretch to see her as similarly directed with respect to Lurie (who has, after the attack, presumably lost most of whatever physical charms he once had, further reinforcing his status as an object of pity). At the same time, though, there are plenty of reasons to doubt the accuracy of Lurie’s assessments concerning other people in general and women in particular. Even if he’s right that one of the reasons she sleeps with him is a keenly developed sense of altruism, that judgment would be of a piece with his generally egocentric mode of explaining things. So we should probably keep open the possibility that Bev has her own reasons and desires (to which we are not directly privy), rather than taking the affair entirely as evidence of her selflessness.

As for Lurie, the proffered explanation seems likewise to fall between desire and something like compassion. On one hand, it may simply be the case that he will sleep with more or less anyone (there have been hundreds, he says, over the years [192]), Bev merely serving as an illustration of exactly how far he has fallen from “the sweet young flesh of Melanie Isaacs,” of “what [he] will have to get used to” (150). But there’s at least a little more to it than that; Lurie is aware, when he explains Bev’s actions to himself, that her self-image is bound up with her ability to provide succor, hence that it is important that he “does his duty. Without passion but without distaste either. So that in the end Bev Shaw can feel pleased with herself” (150). This may not be much as other-directed ethics go, but it’s not nothing.

So Bev begins the affair from generosity and perhaps a bit of pity, mixed with some presumptive degree of desire, while Lurie is motivated by a baseline desire and a species of second-order generosity. Fair enough, if unexceptional. Why does this matter? A couple of reasons. For one, the affair might in a pinch serve as a mode of redemption for Lurie, particularly insofar as it transcends physical desire (the proximate cause of his initial disgrace). I find this option unconvincing, since it both oversimplifies the affair itself and implies a monastic morality, in effect turning the book into an object lesson in the evils of the flesh. Despite the novel’s obvious interest in the problems of aging and its effects on the body, there’s no generalized distaste for sexuality in Disgrace (nor elsewhere in Coetzee’s writing), and it strikes me as a serious error to introduce one here. More importantly, Bev’s motivations are of interest insofar as they are necessarily related to her euthanasia work, which is likewise presented as an act of generosity. Are these instances of the same impulse in her? How does the answer to that question shape our reading of her work with the dogs, and of the role of dogs generally in the novel? See the answer to the next questions for some thoughts.

The Inquiry and Lurie’s Defense

Here’s the first of my answers to the questions about Disgrace raised in this post. The questions were posed in arbitrary order, but the answers try to work forward in some semi-logical way, hence they don’t follow the original order.

Question

  • Why does Lurie give up his job by refusing to defend himself before the inquiry?

Short answer: Because he doesn’t fear the consequences of doing so.

This question is leadingly posed; it might instead have read “Is Lurie mistreated by the university’s commission of inquiry?”

There’s not a lot of critical consensus on this point (in either version), and it’s an important one. If you think that Lurie is mistreated (see Anker in MFS, for example), then you have two options in reply to the question of why he doesn’t do more to defend himself. On one hand, you might deny that he fails to offer a vigorous defense, which means that you’d need to take seriously both his claims about the “rights of desire” and the perceived adequacy of that defense. You would, in other words, need to take Lurie at his word, to treat him as straightforwardly sincere (and as having badly miscalculated the effect of his testimony). I know of no critic who has espoused exactly this position in print, though it is not entirely unsupported in the text (Rosalind, for instance, imputes it to Lurie, and he sometimes—but not often—sounds like he takes it seriously himself).

On the other hand, you might instead see him as a kind of martyr, one who is unwilling to offer an insincere confession or apology even when he knows that his (honestly given) defense will not save him from punishment. This is Anker’s position, which she uses to read the novel as a critique of the inquiry’s (and by proxy the TRC’s) human rights discourse. It’s not wholly implausible (that’s as positive as I can be about it), but notice that it almost certainly commits you to seeing Lurie as the aggrieved party in the aftermath of his affair with Melanie, and hence as its true, noble victim, one unwilling to compromise his principles for politically motivated expediency. This is the basis of many readings of the novel that find its politics objectionable (e.g., Roos’ review in the Cape Argus), primarily because it is taken to suggest that, like Lurie, whites have paid too great a personal and political price in postapartheid South Africa, that they suffer out of proportion to their crimes. (This is a position that also depends, clearly, on additional evidence, mostly in connection with Lucy’s rape; more on this when I come to later questions.)

[Incidentally, the reader will also need to decide, in any case, what offense the inquiry is attempting to punish. Is it Lurie’s rape of Melanie, or merely the fact of their affair? For reasons I laid out in this post, I think it’s unlikely that the formal charges against Lurie include rape. But I also think it’s proper for the reader to understand the inquiry as addressing the sum of Lurie’s transgressions, rape included; if you want to argue that Lurie is wronged, you need to claim that his punishment is excessive relative to all the facts we readers know, not solely those contained in Melanie’s statement (which is withheld from the reader entirely). Ditto, of course, Lucy’s incomplete police report, on which more in a later post.]

If we answer the implied “Is Lurie wronged?” question in the negative, we likewise have two potential explanations of his meager defense. Well, OK, two and a half: The “half” is a rejection of the premise identical to the one above, i.e., the claim that Lurie does offer what he seriously considers to be an effective defense, but that he is simply (and badly) mistaken in this judgment. Again, I find little textual evidence in favor of this position. More plausible are two other readings: (1) That he recognizes the justice of his relatively severe punishment and therefore refuses to shirk it by defending himself more effectively, or (2) That he does not consider the punishment on offer particularly severe, and that he is therefore unwilling to make even relatively small sacrifices to avoid it.

The first case, I think, credits Lurie with entirely too developed a sense of justice and personal culpability at too early a point in the text. It’s an open question, it seems to me—or at least a very difficult and important one—whether or not he has by the end of the novel achieved anything like this level of ethical awareness. But while he’s not blind to the ethical dimensions of the affair while it’s taking place, neither does he seem especially troubled by them, and certainly not so much as to accept the justice of his punishment.

Which leaves us with the second possibility, namely that Lurie refuses to go along with the compromises offered to him because he does not regard the loss of his position at the university as particularly troubling, nor in any case as worth sacrificing his mildly Byronic self-image to preserve. He is by his own description an indifferent teacher, he has at best a dutiful interest in his subject matter (communications rather than literature), he is no more than a modest scholar, he’s fifty-two years old, in no fear (perhaps erroneously) of losing his pension, and he has both other projects to occupy his time (the Byron opera) and alternative practical arrangements (at Lucy’s farm) to support himself. He has few enough friends, it seems, at the university or in Cape Town, so his loss of social standing is moderate, and would likely be little better even if he were to keep his job by admitting fault and undergoing counseling. So why, finally, should he bend at all far to achieve a solution that on the whole may be worse (as far as he’s concerned at the time) than the worst-case outcome of the inquiry?

This last option strikes me as by far the most plausible, and it has several advantages as part of a larger reading of the novel. Most importantly, it avoids any suggestion of Lurie as a victim of the commission, which in turn preserves more interpretive options with respect to the later attack and reduces the risk of needing to read the novel (against all evidence of Coetzee’s own political convictions, not that these need necessarily be controlling) as politically objectionable. It also avoids sanctifying Lurie from the outset, which would produce a very static reading indeed. Finally, it preserves a sense of Lurie as a plausible character in his own right, rather than making him a solely allegorical figure. This last point is important not because we’re trying to avoid allegory (why and how could we?), but because the allegorical reading we eventually do construct will be much richer if it’s built up from complex characters than from simple ones.

So that’s one question down, many to go. More to come …

Eight Questions about Disgrace

A while back, I blogged about some issues in Coetzee criticism. As I’m continuing work on my own essay about Disgrace, I’ve come up with a list of questions that I think every critic should be able to answer about the book before writing on it. These aren’t the only relevant questions, of course, nor does answering them constitute criticism proper. But they’re the prerequisites of criticism; if you can’t take and defend a position on each of them, you haven’t thought hard enough about the novel and its tensions to offer a coherent reading of the work as a whole.

The questions, in loose order of dependence (but definitely out of textual order):

  1. Why does Lurie give up his job by refusing to defend himself before the inquiry?
  2. Why does Lurie sleep with Bev, and she with him?
  3. How are we to treat Lurie’s opera?
  4. Why does Lurie give up the dog at the end of the novel?
  5. In what sense, if any, is Bev Shaw’s (and Lurie’s) euthanasia of the dogs an ethical/merciful/loving act?
  6. Why does Lucy refuse to report her rape or otherwise pursue legal remedy for it?
  7. Why does Lucy remain on the farm after the attack?
  8. What is the relationship between the two rapes?

[Note: Links from each question above point to the post with my answer to it.]

As I say, certainly not the only questions one could or should ask about the novel. But they’re crucial because they address the specific content of Coetzee’s allegorical meaning. It’s not enough to claim that the book is, for example, an allegory of South African society after apartheid (which is to say almost nothing at all, yet seems to satisfy many critics); you need to work out the tenor of that allegory. And it turns out that that’s a difficult and fraught thing to do, because it requires you to take positions on questions like these about which the novel is ambivalent or ambiguous or flatly contradictory. But that’s why we get paid the big bucks, isn’t it?

My own answers to each of these in the coming days …

POS Frequencies in the MONK Corpus, with Additional Musings

This post is on the work I presented at DH ’09, plus some thoughts on what’s next for my project. It’s related to this earlier post on preliminary part-of-speech frequencies across the entire MONK corpus, but includes new material and figures based on some data pruning and collection as mentioned in this post (details below).

A word, first, on why I’m working on this. I don’t really care, of course, about the relative frequencies of various parts of speech across time, any more than chemists care about, say, the absorption spectra of molecules. What I’m looking for are useful diagnostics of things that I do care about but that are hard to measure directly (like, say, changes in the use of allegory across historical time or, more broadly, in rhetorical cues of literary periodization).

My hypothesis is that allegory should be more prominent and widespread in the short intervals between literary-historical periods than during the periods themselves. Since we also suspect that allegorical writing should be “simpler” on its face than non-allegorical writing (because it needs to sustain an already complicated set of rhetorical mappings over large parts of its narrative), it makes sense (in the absence of a direct measure of “allegoricalness”) to look for markers of comparative narrative simplicity/complexity as proxies for allegory itself. I think part-of-speech frequency might be one such measure. In any case if I’m right about allegory and periodization and if I’m also right about specific POS frequencies as indicators of allegory, then we should expect certain POS frequencies to exhibit significant (in the statistical sense) fluctuations around periodizing moments and events. (I wish there were fewer ifs in that last sentence; I’ll say a bit below about how one could eliminate them.)

So … what do we see in the MONK case? Recall that the results from the full dataset looked like this:

POS Frequencies, Full MONK Corpus

But that’s messy and not of much use. It doesn’t focus on the few POS types that I think might be relevant (nouns, verbs, adjectives, adverbs); it includes a bunch of texts that aren’t narrative fiction (drama, sermons, etc.); and it’s especially noisy because I didn’t make any attempt to control for years in which very few texts (or authors) were published. (Note that the POS types listed are the reduced set of so-called “word classes” from NUPOS.)

Here’s what we get if we limit the POSs (PsOS?) in question, exclude texts that aren’t narrative fiction, and group together the counts from nearby years with low quantities of text:

POS Frequencies, Reduced and Consolidated MONK Corpus

And here’s the same figure with the descriptive types (adjectives and adverbs) added together:

POS Frequencies, Reduced and Consolidated MONK Corpus (Adj + Adv)

[Some data details, skippable if you don’t care. First, note that the x axes in all three figures need to be fixed up; they’re just bins by year label, rather than proper independent variables. I’ll fix this soon, but it doesn’t make much difference in the results. You can download the raw POS counts for the full corpus (not sorted by year of publication), as well as those restricted to texts with genre = fiction. These are interesting, I guess, but more useful are the same figures split out by year of publication, both for the whole corpus, and just for fiction (presented as frequencies rather than counts). Finally, there are the fiction-only, year-consolidated numbers (back to counts for these, because I’m lazy). The table of translations between the full NUPOS tags and the (very reduced) word classes presented here is also available.]
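
For anyone who wants to reproduce the reduction, here is a minimal sketch in Python of the steps described above: keep only fiction, consolidate neighboring years until each bin has enough text, report the four word classes as frequencies, and sum adjectives and adverbs. The record layout, the word-class labels, the `min_tokens` threshold, and the choice to divide by all tokens in a bin are assumptions made for illustration; they are not the actual MONK export format or the exact procedure behind the figures.

```python
from collections import defaultdict

# Word classes of interest. The labels are illustrative, not the NUPOS names.
KEEP = ("noun", "verb", "adjective", "adverb")

def consolidate(records, min_tokens):
    """Merge consecutive years of fiction until each bin has >= min_tokens tokens.

    records: iterable of (year, genre, {word_class: count}) tuples
             (a hypothetical layout, assumed for this sketch).
    Returns a list of (years_in_bin, summed_counts) pairs.
    """
    by_year = defaultdict(lambda: defaultdict(int))
    for year, genre, counts in records:
        if genre != "fiction":          # drop drama, sermons, etc.
            continue
        for pos, n in counts.items():
            by_year[year][pos] += n     # keep every class so totals stay honest

    bins, years, current = [], [], defaultdict(int)
    for year in sorted(by_year):
        years.append(year)
        for pos, n in by_year[year].items():
            current[pos] += n
        if sum(current.values()) >= min_tokens:
            bins.append((years, dict(current)))
            years, current = [], defaultdict(int)

    if years:                           # leftover sparse years at the end
        if bins:                        # fold them into the previous bin
            prev_years, prev_counts = bins[-1]
            merged = defaultdict(int, prev_counts)
            for pos, n in current.items():
                merged[pos] += n
            bins[-1] = (prev_years + years, dict(merged))
        else:
            bins.append((years, dict(current)))
    return bins

def frequencies(counts):
    """Frequencies of the four classes relative to all tokens, plus adj + adv."""
    total = sum(counts.values())
    freqs = {pos: counts.get(pos, 0) / total for pos in KEEP}
    freqs["adj+adv"] = freqs["adjective"] + freqs["adverb"]
    return freqs
```

Calling `consolidate(records, min_tokens)` and then `frequencies(counts)` on each bin gives one row per consolidated period. Folding trailing sparse years into the last bin is just one reasonable choice; dropping them would be another.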

So what does this all mean? The first thing to notice is that there’s no straightforward confirmation of my hypotheses in these figures. There’s some meaningful fluctuation in noun and verb frequency over the first half of the nineteenth century—which I think might be an interesting indication of the kind of writing that was dominant at the time (see the noun and verb frequency section of this post)—but no corresponding movement in the combined frequency of adjectives and adverbs. This might mean several things: I might be wrong about the correlation between such frequencies and periodizing events, or I might not be looking at the right POS types, or (quite likely, regardless of other factors) I might not have low enough noise levels to distinguish what one would expect to be fairly small variations in POS frequency.

Where to go from here? A few directions:

I’ll keep working on a bigger corpus. The fiction holdings from MONK are only about 1000 novels, spread (unevenly) over 120+ (or 150+) years. So we’re looking at eight or fewer books on average in any one year, and that’s just not very much if we want good statistics.
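
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The corpus size and year spans come from the paragraph above; the function names are made up for illustration, and treating books as independent samples (so that the standard error of a yearly mean falls as one over the square root of the number of books) is a simplifying assumption, not something established for this corpus.

```python
import math

def books_per_year(novels, span_years):
    """Average number of books per year if they were spread evenly."""
    return novels / span_years

def relative_standard_error(books):
    """Standard error of a per-year mean, as a fraction of the
    between-book standard deviation (assumes independent books)."""
    return 1 / math.sqrt(books)

for span in (120, 150):
    k = books_per_year(1000, span)
    print(f"{span} years: {k:.1f} books/year, "
          f"SE = {relative_standard_error(k):.2f} x between-book SD")
```

With about 8.3 or 6.7 books per year, the uncertainty on any yearly average stays at roughly a third or more of the spread between individual books, which is why small shifts in frequency would be hard to see.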

There are a couple of ways to go about doing this. Gutenberg has around 10,000 works of fiction in English, so it’s an order of magnitude larger. There are issues with their cataloging and bibliographic quality, but I think they’re addressable and I’m at work on them now. The Open Content Alliance has hundreds of thousands of high-quality digitizations from research libraries, though there are some cataloging issues and I’m not sure about final text quality (which relies on straight OCR rather than hand-correction as does Gutenberg). Still, OCA (or Google Books, depending on what happens with the proposed settlement, or Hathi) would offer the largest possible corpus for the foreseeable future. I’ve been talking to Tim Cole at UIUC about the OCA holdings and will report more as things come together.

But I think it’s also worth asking whether or not POS frequencies are the right way to go; I started down that path on a hunch, and it would be nice to have some promising data before I put too much more effort into pursuing it. What I need, really, are some exploratory descriptive statistics comparing known allegorical and nonallegorical texts. One reason I’ve held off on doing that is that it seems like a big project. The time span I have in mind (several centuries), together with the range of styles, genres, national origins, genders, etc., suggests that the test corpus would need to be large (on the order of hundreds of books, say) if it’s not to be dominated by any one author/nation/gender/period/subject/etc. But how much reading and thinking would I have to do to identify, with high confidence, at least 100 major works of allegorical fiction and another 100 of comparable nonallegorical fiction? And would even that be enough? A daunting prospect, though it’s something that I’m probably going to have to do at some point.

But I got an interesting suggestion from Jan Rybicki (who works in authorship attribution, not coincidentally) at DH. Maybe it would suffice, at least preliminarily, to pick a handful of individual authors who wrote both allegorical and nonallegorical works reasonably close together in time, and to look for statistical distinctions between the two kinds of work. Since I’d be dealing with the same author, many of the problems of variation in period, national origin, gender, and so forth would go away, or at least be minimized. I suspect this wouldn’t do very well for finding distinctive keywords, which I imagine would be too closely tied to the specific content of each work (which is a problem that the larger training set is intended to overcome), but it might turn up interesting lower-level phenomena like (just off the top of my head) differences in character n-grams or sentence length. It would take some work to slice and dice the texts in every conceivably relevant statistical way, but I’m going to need to do that anyway and it’s hardly prohibitive.
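As a first pass at the two low-level features just mentioned, something like the following would do. It is a minimal sketch, not a method anyone has recommended to me: the function names are mine, the sentence splitter is a naive regular expression that will stumble on abbreviations and dialogue, and words are simply whitespace-separated tokens.

```python
import re
from collections import Counter


def char_ngram_freqs(text, n=3):
    """Relative frequency of each character n-gram in the text.

    Lowercases and collapses runs of whitespace first, so that line
    breaks in the source file don't produce spurious n-grams.
    """
    normalized = re.sub(r"\s+", " ", text.lower()).strip()
    counts = Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()} if total else {}


def mean_sentence_length(text):
    """Average words per sentence, splitting naively on ., ! and ?."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    return sum(len(s.split()) for s in sentences) / len(sentences)


def largest_differences(freqs_a, freqs_b, top=10):
    """N-grams whose frequencies differ most between two texts."""
    grams = set(freqs_a) | set(freqs_b)
    diffs = {g: freqs_a.get(g, 0.0) - freqs_b.get(g, 0.0) for g in grams}
    return sorted(diffs.items(), key=lambda item: abs(item[1]), reverse=True)[:top]
```

The idea would be to run these over one author's allegorical and nonallegorical works separately and compare the outputs. Whether any difference that turns up is meaningful rather than noise is a separate question that this sketch doesn't address.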

So that’s one easy, immediate thing to do. In the longer run, what I really want is to see what people in the field have understood to be allegorical and what not, which would have the great advantage, at least as a reference point, of eliminating some of the problems of individual selection bias. One way to do that would be to mine JSTOR, looking, for example, for collocates of “allegor*” or (more ambitiously) trying to do sentiment analysis on actual judgments of allegoricalness. I suspect the latter is out of the question at the moment (as I understand it, the current state of the art is something like determining whether or not customer product reviews are positive or negative, which seems much, much easier than determining whether or not an arbitrary scholarly article considers any one of the several texts it discusses to be allegorical or not). But the former—finding terms that go along with allegory in the professional literature, seeing how the frequency of the term itself and of specific allegorical works and authors changes over (critical) time, and so on—might be both easy and helpful; at the very least, it would be immensely interesting to me. So that’s something to do soon, too, depending on the details of JSTOR access. (JSTOR is one of the partners for the Digging into Data Challenge and they’ve offered limited access to their collection through a program they’re calling “data for research,” so I know they’re amenable to sharing their corpus in at least some circumstances. I was told at THATCamp by Loretta Auvil that SEASR is working with them, too.)

[Incidentally, SEASR is something I’ve been meaning to check out more closely for a long time now. The idea of packaged but flexible data sources, analytics, and visualizations could be really powerful and could save me a ton of time.]
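Returning to the JSTOR idea: the collocate search for “allegor*” is simple enough to sketch now, whatever form access to the articles eventually takes. The version below assumes plain article text in a string and counts the words that fall within a fixed window on either side of any word beginning with “allegor”. The function name, the window size, and the crude tokenizer are my own choices for illustration. Nothing here touches JSTOR’s actual interface, which I haven’t seen.

```python
import re
from collections import Counter


def collocates(text, pattern=r"allegor\w*", window=5):
    """Count words appearing within `window` tokens of a match.

    `pattern` is matched against whole lowercase tokens, so the default
    catches allegory, allegorical, allegorize, and so on.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    target = re.compile(pattern)
    counts = Counter()
    for i, token in enumerate(tokens):
        if target.fullmatch(token):
            start = max(0, i - window)
            neighbors = tokens[start:i] + tokens[i + 1:i + 1 + window]
            # Skip other forms of the target word itself.
            counts.update(t for t in neighbors if not target.fullmatch(t))
    return counts


sample = "The allegory of the cave is a political allegory."
print(collocates(sample, window=2).most_common(3))
```

On real articles the raw counts would be dominated by function words, so in practice one would filter a stopword list or compare against each word’s background frequency in the corpus.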

Finally (I had no idea I was going to go on so long), there are a couple of things I should read: Patrick Juola’s “Measuring Linguistic Complexity” (J Quant Ling 5:3 [1998], 206-13)—which might have some pointers on distinguishing complex nonallegorical works from simpler allegorical ones—plus newer work that cites it. And Colin Martindale’s The Clockwork Muse, which has been sitting on my shelf for a while and which was (re)described to me at DH as “brilliant and infuriating and wacky.” Sign me up.