An Approach to Reducing Annotation Costs for BioNLP

dc.contributor.author: Bloodgood, Michael
dc.contributor.author: Vijay-Shanker, K
dc.date.accessioned: 2014-08-26T19:27:04Z
dc.date.available: 2014-08-26T19:27:04Z
dc.date.issued: 2008-06
dc.description.abstract: There is a broad range of BioNLP tasks for which active learning (AL) can significantly reduce annotation costs and a specific AL algorithm we have developed is particularly effective in reducing annotation costs for these tasks. We have previously developed an AL algorithm called ClosestInitPA that works best with tasks that have the following characteristics: redundancy in training material, burdensome annotation costs, Support Vector Machines (SVMs) work well for the task, and imbalanced datasets (i.e. when set up as a binary classification problem, one class is substantially rarer than the other). Many BioNLP tasks have these characteristics and thus our AL algorithm is a natural approach to apply to BioNLP tasks. (en_US)
dc.identifier: https://doi.org/10.13016/M2VC7V
dc.identifier.citation: Michael Bloodgood and K. Vijay-Shanker. 2008. An approach to reducing annotation costs for BioNLP. In Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing, pages 104-105, Columbus, Ohio, June. Association for Computational Linguistics. (en_US)
dc.identifier.uri: http://hdl.handle.net/1903/15584
dc.language.iso: en_US (en_US)
dc.publisher: Association for Computational Linguistics (en_US)
dc.relation.isAvailableAt: Center for Advanced Study of Language
dc.relation.isAvailableAt: Digital Repository at the University of Maryland
dc.relation.isAvailableAt: University of Maryland (College Park, Md)
dc.subject: computer science (en_US)
dc.subject: statistical methods (en_US)
dc.subject: artificial intelligence (en_US)
dc.subject: machine learning (en_US)
dc.subject: computational linguistics (en_US)
dc.subject: natural language processing (en_US)
dc.subject: human language technology (en_US)
dc.subject: text processing (en_US)
dc.subject: active learning (en_US)
dc.subject: selective sampling (en_US)
dc.subject: query learning (en_US)
dc.subject: annotation bottleneck (en_US)
dc.subject: annotation costs (en_US)
dc.subject: support vector machines (en_US)
dc.subject: SVMs (en_US)
dc.subject: cost-weighted support vector machines (en_US)
dc.subject: cost-weighted SVMs (en_US)
dc.subject: imbalanced data (en_US)
dc.subject: imbalanced datasets (en_US)
dc.subject: asymmetric cost factors (en_US)
dc.subject: asymmetric cost weights (en_US)
dc.subject: cost-sensitive learning (en_US)
dc.subject: cost-sensitive active learning (en_US)
dc.subject: imbalanced learning (en_US)
dc.subject: BioNLP (en_US)
dc.subject: biomedical natural language processing (en_US)
dc.subject: biomedical text processing (en_US)
dc.subject: protein-protein interaction extraction (en_US)
dc.subject: Medline text classification (en_US)
dc.subject: biomedical named entity recognition (en_US)
dc.subject: biomedical NER (en_US)
dc.subject: biomedical named entity classification (en_US)
dc.title: An Approach to Reducing Annotation Costs for BioNLP (en_US)
dc.type: Article (en_US)

Files

Original bundle

Name: reducingAnnotationCostsBioNLP2008.pdf
Size: 60.85 KB
Format: Adobe Portable Document Format