A Human-centric Approach to NLP in Healthcare Applications
Abstract
The abundance of personal health information available to healthcare professionals can facilitate better care. However, it can also be a barrier: the relevant information is often buried in the sheer volume of personal data, and healthcare professionals already lack the time to attend to both patients and their data. This dissertation focuses on the role of natural language processing (NLP) in healthcare and how it can surface information relevant to healthcare professionals by modeling the extensive collections of documents that describe those whom they serve.
In this dissertation, the extensive natural language data about a person is modeled as a set of documents: model inference is made at the level of the individual, but the evidence supporting that inference is found in a subset of their documents. The effectiveness of this modeling approach is demonstrated in the context of three healthcare applications. In the first application, clinical coding, document-level attention is used to model the hierarchy between a clinical encounter and its documents, jointly learning the encounter labels and the assignment of credit to specific documents. The second application, suicidality assessment using social media, further investigates how document-level attention can surface "high-signal" posts from the document set representing a potentially at-risk individual. Finally, the third application aims to help healthcare professionals write discharge summaries, using an extract-then-abstract multi-document summarization pipeline to surface relevant information.
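The idea of document-level attention can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function names `attention_pool` and `predict`, the dot-product scoring, the query vector, and the toy numbers are all assumptions chosen for illustration. It only shows how a single individual-level prediction can be made from a set of document vectors while the attention weights indicate which documents contributed most.

```python
import math

def attention_pool(doc_vectors, query):
    """Return (pooled_vector, attention_weights) for one individual's documents.

    Hypothetical helper: scores each document by a dot product with a query
    vector (which would be learned in a real model) and softmax-normalizes.
    """
    scores = [sum(d * q for d, q in zip(doc, query)) for doc in doc_vectors]
    # Softmax with max-subtraction for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The weighted average of document vectors is the individual-level representation.
    dim = len(query)
    pooled = [sum(w * doc[i] for w, doc in zip(weights, doc_vectors))
              for i in range(dim)]
    return pooled, weights

def predict(pooled, label_weights):
    """Independent sigmoid per label (an assumed multi-label setup)."""
    return [1.0 / (1.0 + math.exp(-sum(w * x for w, x in zip(row, pooled))))
            for row in label_weights]

# Toy inputs (made up): three documents with 3-dimensional vectors, two labels.
docs = [[0.1, 0.0, 0.2], [2.0, 1.0, 0.5], [0.0, 0.3, 0.1]]
query = [1.0, 1.0, 0.0]
label_weights = [[0.5, -0.2, 0.1], [-1.0, 0.4, 0.0]]

pooled, weights = attention_pool(docs, query)
probs = predict(pooled, label_weights)

# Documents ordered from highest to lowest attention weight.
ranked_docs = sorted(range(len(docs)), key=lambda i: -weights[i])
```

In this sketch, `probs` plays the role of the individual-level inference and `ranked_docs` the ordering of documents by the credit the model assigns to them, which is the kind of evidence a professional could review first.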
As in many healthcare applications, these three applications seek to assist, not replace, clinicians. Evaluation and model design thus center on healthcare professionals' needs. In clinical coding, document-level attention is shown to align well with professional clinical coders' expectations of evidence. In suicidality assessment, document-level attention leads to better and more time-efficient assessment by surfacing document-level evidence, as shown empirically using a theoretically grounded time-aware evaluation measure and a dataset annotated by suicidality experts. Finally, extract-then-abstract summarization pipelines that assist healthcare professionals in writing discharge summaries are evaluated by their ability to surface faithful and relevant evidence.