Analysis of Stopping Active Learning based on Stabilizing Predictions

Bloodgood, Michael; Grothendieck, John

dc.contributor.author	Bloodgood, Michael
dc.contributor.author	Grothendieck, John
dc.date.accessioned	2014-07-14T21:10:24Z
dc.date.available	2014-07-14T21:10:24Z
dc.date.issued	2013-08
dc.description.abstract	Within the natural language processing (NLP) community, active learning has been widely investigated and applied in order to alleviate the annotation bottleneck faced by developers of new NLP systems and technologies. This paper presents the first theoretical analysis of stopping active learning based on stabilizing predictions (SP). The analysis has revealed three elements that are central to the success of the SP method: (1) bounds on Cohen’s Kappa agreement between successively trained models impose bounds on differences in F-measure performance of the models; (2) since the stop set does not have to be labeled, it can be made large in practice, helping to guarantee that the results transfer to previously unseen streams of examples at test/application time; and (3) good (low variance) sample estimates of Kappa between successive models can be obtained. Proofs of relationships between the level of Kappa agreement and the difference in performance between consecutive models are presented. Specifically, if the Kappa agreement between two models exceeds a threshold T (where T > 0), then the difference in F-measure performance between those models is bounded above by 4(1−T)/T in all cases. If precision of the positive conjunction of the models is assumed to be p, then the bound can be tightened to 4(1−T)/((p+1)T).	en_US
dc.identifier.citation	Michael Bloodgood and John Grothendieck. 2013. Analysis of stopping active learning based on stabilizing predictions. In Proceedings of the Seventeenth Conference on Computational Natural Language Learning, pages 10-19, Sofia, Bulgaria, August. Association for Computational Linguistics.	en_US
dc.identifier.uri	http://hdl.handle.net/1903/15526
dc.language.iso	en_US	en_US
dc.publisher	Association for Computational Linguistics	en_US
dc.relation.isAvailableAt	Center for Advanced Study of Language
dc.relation.isAvailableAt	Digital Repository at the University of Maryland
dc.relation.isAvailableAt	University of Maryland (College Park, Md)
dc.subject	computer science	en_US
dc.subject	artificial intelligence	en_US
dc.subject	machine learning	en_US
dc.subject	active learning	en_US
dc.subject	selective sampling	en_US
dc.subject	query learning	en_US
dc.subject	stopping criteria	en_US
dc.subject	stopping methods	en_US
dc.subject	stabilizing predictions	en_US
dc.subject	statistical analysis	en_US
dc.subject	performance bounds	en_US
dc.subject	agreement statistics	en_US
dc.subject	agreement metrics	en_US
dc.subject	annotation bottleneck	en_US
dc.subject	Kappa statistic	en_US
dc.subject	Cohen's Kappa	en_US
dc.subject	F-measure	en_US
dc.subject	F-score	en_US
dc.subject	relationship between Kappa and F-measure	en_US
dc.subject	contingency table analysis	en_US
dc.subject	natural language processing	en_US
dc.subject	computational linguistics	en_US
dc.title	Analysis of Stopping Active Learning based on Stabilizing Predictions	en_US
dc.type	Article	en_US
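
The bounds quoted in the abstract can be illustrated with a short sketch. It computes Cohen's Kappa between the predictions of two successively trained models and evaluates the stated bounds 4(1−T)/T and 4(1−T)/((p+1)T). The function names, the toy prediction lists, and the handling of the degenerate single-label case are assumptions made for illustration; they are not taken from the paper.

```python
from collections import Counter


def cohens_kappa(preds_a, preds_b):
    """Cohen's Kappa agreement between two equal-length label sequences."""
    if len(preds_a) != len(preds_b) or len(preds_a) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(preds_a)
    # Observed agreement: fraction of examples labeled identically.
    observed = sum(a == b for a, b in zip(preds_a, preds_b)) / n
    # Chance agreement: computed from each model's label frequencies.
    counts_a = Counter(preds_a)
    counts_b = Counter(preds_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Assumption: both models emit one identical label everywhere;
        # treat this degenerate case as perfect agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def f_measure_difference_bound(threshold, precision=None):
    """Upper bound on the F-measure difference, as stated in the abstract.

    threshold: Kappa threshold T, with 0 < T <= 1.
    precision: optional assumed precision p of the positive conjunction
               of the two models; if given, the tightened bound is used.
    """
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold T must satisfy 0 < T <= 1")
    bound = 4.0 * (1.0 - threshold) / threshold
    if precision is not None:
        bound /= precision + 1.0
    return bound


# Hypothetical predictions of two consecutive models on an unlabeled stop set.
model_prev = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
model_curr = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
kappa = cohens_kappa(model_prev, model_curr)
```

On this toy data the two models disagree on one of ten examples and Kappa is about 0.78. For a threshold of T = 0.99, `f_measure_difference_bound(0.99)` is roughly 0.040, and with an assumed precision p = 1 the tightened bound is roughly 0.020.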

Files

Original bundle

Name: analysisOfStoppingCoNLL2013.pdf
Size: 249.19 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.57 KB
Format: Item-specific license agreed to upon submission

Collections

Center for Advanced Study of Language Research Works