STUDYING PRODUCT REVIEWS USING SENTIMENT ANALYSIS BASED ON INTERPRETABLE MACHINE LEARNING

dc.contributor.advisor: Dutta, Sanghamitra
dc.contributor.advisor: Wu, Min
dc.contributor.author: Atrey, Pranjal
dc.contributor.department: Electrical Engineering
dc.contributor.publisher: Digital Repository at the University of Maryland
dc.contributor.publisher: University of Maryland (College Park, Md.)
dc.date.accessioned: 2024-02-14T06:56:29Z
dc.date.available: 2024-02-14T06:56:29Z
dc.date.issued: 2023
dc.description.abstract: Consumers’ reliance on product reviews and ratings has a substantial impact on purchasing behavior in e-commerce. However, the relationship between reviews and ratings has received limited attention. For instance, a product may have a high rating but only average reviews. Such conflicting feedback can cause confusion and uncertainty about a product, leading to decreased trust in it. This thesis carries out a natural-language-based machine learning study to analyze this relationship using e-commerce big data of product reviews and ratings. Towards answering this relationship question using natural language processing (NLP), we first employ data-driven sentiment analysis to obtain numeric sentiment scores from the reviews, which are then used to study the correlation with actual ratings. For sentiment analysis, we consider the use of both glass-box (rule-based) and black-box (BERT) models. We find that while the black-box model is more correlated with product ratings, there are interesting counterexamples where the sentiment analysis results of the glass-box model are better aligned with the rating. Next, we explore how well ratings can be predicted from the text reviews, and whether sentiment scores can further improve the classification of reviews. We find that neither the opaque nor the glass-box classification model consistently yields better accuracy, and that classification accuracy mostly improves when reviews are augmented with BERT sentiment scores. Furthermore, to understand what different models use to predict ratings from reviews, we employ Local Interpretable Model-Agnostic Explanations (LIME) to explain the impact of words in reviews on the decisions of the classification models. Noting that different models can give similar predictions, a phenomenon known as the Rashomon Effect, our work provides insights on which words actually contribute to the decision-making of classification models, even in scenarios where an incorrect classification is made.
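The sentiment-to-rating correlation step described in the abstract can be sketched in miniature. The toy lexicon, reviews, and ratings below are purely illustrative (the thesis itself uses full rule-based and BERT sentiment models on real e-commerce data); this only shows the general shape of the pipeline: score each review, then correlate the scores with the star ratings.

```python
# A minimal, hypothetical sketch of a glass-box (lexicon/rule-based) sentiment
# scorer and its Pearson correlation with star ratings. The lexicon and data
# below are invented for illustration, not taken from the thesis.

from statistics import mean

# Toy sentiment lexicon: word -> polarity in [-1, 1] (illustrative values).
LEXICON = {"great": 1.0, "love": 0.8, "good": 0.5,
           "average": 0.0, "poor": -0.5, "terrible": -1.0}

def sentiment_score(review: str) -> float:
    """Average the lexicon polarities of the words found in the review."""
    hits = [LEXICON[w] for w in review.lower().split() if w in LEXICON]
    return mean(hits) if hits else 0.0

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Illustrative reviews paired with their (made-up) star ratings.
reviews = ["great product love it", "good but average", "terrible quality poor"]
ratings = [5, 3, 1]
scores = [sentiment_score(r) for r in reviews]
```

In the actual study, the glass-box scorer is replaced by a full rule-based model and the black-box scorer by BERT; the correlation computed here is what lets the two be compared against product ratings.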
dc.identifier: https://doi.org/10.13016/6cvw-dlwd
dc.identifier.uri: http://hdl.handle.net/1903/31789
dc.language.iso: en
dc.subject.pqcontrolled: Computer engineering
dc.subject.pqcontrolled: Computer science
dc.title: STUDYING PRODUCT REVIEWS USING SENTIMENT ANALYSIS BASED ON INTERPRETABLE MACHINE LEARNING
dc.type: Thesis

Files

Original bundle
Name: Atrey_umd_0117N_23941.pdf
Size: 1.5 MB
Format: Adobe Portable Document Format