University of Maryland Libraries | Digital Repository at the University of Maryland (DRUM)

    Analysis of Stopping Active Learning based on Stabilizing Predictions

    View/Open
    analysisOfStoppingCoNLL2013.pdf (249.1Kb)

    Date
    2013-08
    Author
    Bloodgood, Michael
    Grothendieck, John
    Citation
    Michael Bloodgood and John Grothendieck. 2013. Analysis of stopping active learning based on stabilizing predictions. In Proceedings of the Seventeenth Conference on Computational Natural Language Learning, pages 10-19, Sofia, Bulgaria, August. Association for Computational Linguistics.
    Abstract
    Within the natural language processing (NLP) community, active learning has been widely investigated and applied in order to alleviate the annotation bottleneck faced by developers of new NLP systems and technologies. This paper presents the first theoretical analysis of stopping active learning based on stabilizing predictions (SP). The analysis has revealed three elements that are central to the success of the SP method: (1) bounds on Cohen’s Kappa agreement between successively trained models impose bounds on differences in F-measure performance of the models; (2) since the stop set does not have to be labeled, it can be made large in practice, helping to guarantee that the results transfer to previously unseen streams of examples at test/application time; and (3) good (low variance) sample estimates of Kappa between successive models can be obtained. Proofs of relationships between the level of Kappa agreement and the difference in performance between consecutive models are presented. Specifically, if the Kappa agreement between two models exceeds a threshold T (where T > 0), then the difference in F-measure performance between those models is bounded above by 4(1−T)/T in all cases. If precision of the positive conjunction of the models is assumed to be p, then the bound can be tightened to 4(1−T)/((p+1)T).
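The bounds stated in the abstract are easy to evaluate numerically. The following is a minimal sketch (the function names and the sample prediction vectors are illustrative, not taken from the paper) that estimates Cohen's Kappa between two successively trained models' predictions on a stop set and computes the corresponding F-measure difference bounds:

```python
import math

def cohens_kappa(preds_a, preds_b):
    """Cohen's Kappa agreement between two models' predictions on the same stop set."""
    n = len(preds_a)
    # observed agreement: fraction of examples where the two models predict the same label
    p_o = sum(a == b for a, b in zip(preds_a, preds_b)) / n
    # expected chance agreement, from each model's label marginals
    labels = set(preds_a) | set(preds_b)
    p_e = sum((preds_a.count(l) / n) * (preds_b.count(l) / n) for l in labels)
    if p_e == 1.0:  # degenerate case: both models predict a single identical label
        return 1.0
    return (p_o - p_e) / (1 - p_e)

def f_measure_diff_bound(T):
    """Upper bound 4(1 - T) / T on the F-measure difference, given Kappa > T > 0."""
    return 4 * (1 - T) / T

def tightened_bound(T, p):
    """Tightened bound 4(1 - T) / ((p + 1) T), where p is the precision
    of the positive conjunction of the two models."""
    return 4 * (1 - T) / ((p + 1) * T)

# Hypothetical predictions of two successive models on a small stop set:
model_t  = [1, 1, 0, 0, 1, 0, 1, 0, 1, 1]
model_t1 = [1, 1, 0, 0, 1, 0, 1, 0, 1, 0]
kappa = cohens_kappa(model_t, model_t1)  # 0.8 for this example
```

For instance, at a threshold of T = 0.99 the general bound 4(1 - T)/T evaluates to about 0.040, i.e. consecutive models whose Kappa agreement exceeds 0.99 can differ by at most roughly four points of F-measure; assuming p = 1 for the positive conjunction halves that bound.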
    URI
    http://hdl.handle.net/1903/15526
    Collections
    • Center for Advanced Study of Language Research Works

    DRUM is brought to you by the University of Maryland Libraries
    University of Maryland, College Park, MD 20742-7011 (301)314-1328.
    Please send us your comments.
    Web Accessibility