David Vadas

I'm a researcher in Computational Linguistics. My work has been in statistical parsing, specifically the structure of noun phrases (NPs). I also have extensive experience with Combinatory Categorial Grammar (CCG) and the C&C parser.


Noun Phrases

Here is the NP data I annotated in the Penn Treebank. Read my thesis for the most in-depth description of the annotation process, the data itself, and how I used it.

Version 1.0

To use the data file, you need the Penn Treebank 3 corpus. The guidelines describe the new annotations that have been added to the corpus.

Version 0.9

There is also an older version, with slightly different guidelines, that was used in the original paper, Adding Noun Phrase Structure to the Penn Treebank.


This data is the result of the CCGbank conversion process reported in Parsing Noun Phrase Structure With CCG. To use the data file, you need the CCGbank corpus.