Modeling Language Development: How Machine Learning can Enhance Analysis of the Language Environment

dc.contributor.advisor: Huang, Yi Ting
dc.contributor.advisor: Newman, Rochelle
dc.contributor.advisor: Domanski, Sophie
dc.contributor.author: Harvey, James
dc.date.accessioned: 2024-12-20T16:04:41Z
dc.date.available: 2024-12-20T16:04:41Z
dc.date.issued: 2024-12-18
dc.description.abstract: Language sampling elicits a representative picture of a child’s language and provides methods for assessing functional communication beyond what standardized tests offer. Naturalistic sampling reduces time costs and offers an ideal way to assess differences in home language associated with differences in socioeconomic status (SES). Unfortunately, dense naturalistic recordings present challenges in scaling analysis and extracting meaningful information. This study investigates the application and analysis of the Language ENvironment Analysis (LENA) system for sampling home language using technology-assisted transcription and topic modeling. To evaluate the efficacy of transcription, segments were selected according to their amount of meaningful speech as measured by LENA and transcribed by Whisper, OpenAI’s automatic speech recognition software. Research assistants trimmed the text files to retain available adult language, separated by utterance. Results suggest that this method of sampling, technology-assisted transcription, and automated analysis of traditional language metrics reproduces expected associations among parental input, SES, and standardized child vocabulary size. Topic models did not identify activity contexts, likely due to the nature of the input. This research presents a validated pipeline for producing dense, representative data using modern approaches that reduce traditional time costs.
dc.identifier: https://doi.org/10.13016/dqfi-fjzy
dc.identifier.uri: http://hdl.handle.net/1903/33554
dc.language.iso: en_US
dc.relation.isAvailableAt: Department of Hearing & Speech Sciences
dc.relation.isAvailableAt: College of Behavioral and Social Sciences
dc.relation.isAvailableAt: Digital Repository at the University of Maryland
dc.relation.isAvailableAt: University of Maryland (College Park, Md)
dc.rights: Attribution-NoDerivs 3.0 United States (en)
dc.rights.uri: http://creativecommons.org/licenses/by-nd/3.0/us/
dc.title: Modeling Language Development: How Machine Learning can Enhance Analysis of the Language Environment
dc.type: Thesis

Files

Original bundle

Name: Honors Thesis Final.pdf
Size: 1.06 MB
Format: Adobe Portable Document Format