| Corpora for biomedical natural language processing | |
|
A project of the Biomedical Text Mining Group at the Center for Computational Pharmacology. Lab: RC-1 S., Room L18-6400A. Phone: 303-916-2417. E-mail: Kevin.Cohen@gmail.com
|
Empirical data on corpus design and usage in biomedical natural language processing

This page provides a link to our 2005 AMIA paper, along with supplemental material related to it. The paper presents data on the usage of various biomedical corpora. It discusses the formats of a couple of corpora that have not been frequently used outside of the labs that built them, and suggests that the format in which annotations are recorded and distributed is an important factor in determining whether a corpus will be widely adopted. At the suggestion of a reviewer, we conducted a survey designed to elicit user preferences regarding various aspects of corpus design and contents. To see the survey and results, follow this link.
|