POS Taggers

Does anyone have an opinion on the relative merits of the various part-of-speech taggers? I’ve used (and had decent luck with) LingPipe, which seems pretty quick and very accurate in my limited tests. I also just read a post by Matthew Jockers about the Stanford Log-linear Part-Of-Speech Tagger (which is what got me thinking about this; I admit I was largely sucked in by the discussion of Xgrid, which I’d really like to try). And I thought the Cornell NLP folks had one, too, though I now can’t find any reference to it, so I may well be wrong. Plus there’s MONK/Northwestern’s MorphAdorner (code not yet generally available, though I don’t think it would be a problem to get it), and any number of commercial options (less attractive, for many reasons).

I surely just need to test a bunch of them in some semi-systematic way, but is there any existing consensus about what works best for literary material?
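For what it's worth, a semi-systematic test could start as small as the sketch below: score each tagger's output against a hand-tagged sample and look at which tags get confused. Everything here is an assumption for illustration: the names `tagger_a` and `tagger_b`, the tiny sample sentence, and the premise that every tagger's output has already been converted to the same tokenization and tagset as the gold sample. It does not call any real tagger's API.

```python
from collections import Counter


def tagging_accuracy(gold, predicted):
    """Fraction of tokens whose predicted tag matches the gold tag."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must cover the same tokens")
    if not gold:
        return 0.0
    correct = sum(1 for (_, g), (_, p) in zip(gold, predicted) if g == p)
    return correct / len(gold)


def confusions(gold, predicted):
    """Count (gold_tag, predicted_tag) pairs for mistagged tokens."""
    return Counter(
        (g, p) for (_, g), (_, p) in zip(gold, predicted) if g != p
    )


# Hypothetical hand-tagged sample (Penn Treebank-style tags assumed).
gold = [("Call", "VB"), ("me", "PRP"), ("Ishmael", "NNP"), (".", ".")]

# Hypothetical outputs from two taggers, already aligned to the gold tokens.
outputs = {
    "tagger_a": [("Call", "VB"), ("me", "PRP"), ("Ishmael", "NNP"), (".", ".")],
    "tagger_b": [("Call", "NN"), ("me", "PRP"), ("Ishmael", "NNP"), (".", ".")],
}

scores = {name: tagging_accuracy(gold, pred) for name, pred in outputs.items()}
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.2%}")
```

The hard part in practice is probably not the scoring but getting each tagger's tokens and tagset lined up with the gold sample, which this sketch simply assumes has been done.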

2 thoughts on “POS Taggers”

Pingback: Matthew Wilkins Evaluates POS Taggers « LingPipe Blog

Jochen L. Leidner says:

February 2, 2009 at 6:09 pm

I think this area has been neglected because as soon as POS taggers performed 98% on *some* corpus, researchers moved on to statistical parsing…

Some pointers:

http://portal.acm.org/citation.cfm?id=520794.878799&coll=&dl=

Click to access atwell.pdf

http://nora.hd.uib.no/corpora/1997-3/0161.html


Work Product

Research notes in quantitative humanities
