- Title
Data preparation and interannotator agreement: BioCreAtIvE task 1B.
- Authors
Colosimo, Marc E; Morgan, Alexander A; Yeh, Alexander S; Colombe, Jeffrey B; Hirschman, Lynette
- Abstract
We prepared and evaluated training and test materials for an assessment of text mining methods in molecular biology. The goal of the assessment was to evaluate the ability of automated systems to generate a list of unique gene identifiers from PubMed abstracts for the three model organisms Fly, Mouse, and Yeast. This paper describes the preparation and evaluation of answer keys for training and testing. These consisted of lists of normalized gene names found in the abstracts, generated by adapting the gene list for the full journal articles found in the model organism databases. For the training dataset, the gene list was pruned automatically to remove gene names not found in the abstract; for the testing dataset, it was further refined by manual annotation by annotators provided with guidelines. A critical step in interpreting the results of an assessment is to evaluate the quality of the data preparation. We did this by careful assessment of interannotator agreement and the use of answer pooling of participant results to improve the quality of the final testing dataset.
- Publication
BMC Bioinformatics, 2005, Vol. 6, Suppl. 1, p. S12
- ISSN
1471-2105
- Publication type
Journal Article
- DOI
10.1186/1471-2105-6-S1-S12