STUDYING PRODUCT REVIEWS USING SENTIMENT ANALYSIS BASED ON INTERPRETABLE MACHINE LEARNING

dc.contributor.advisor: Dutta, Sanghamitra
dc.contributor.advisor: Wu, Min
dc.contributor.author: Atrey, Pranjal
dc.contributor.department: Electrical Engineering
dc.contributor.publisher: Digital Repository at the University of Maryland
dc.contributor.publisher: University of Maryland (College Park, Md.)
dc.date.accessioned: 2024-02-14T06:56:29Z
dc.date.available: 2024-02-14T06:56:29Z
dc.date.issued: 2023
dc.description.abstract: Consumers’ reliance on product reviews and ratings has a substantial impact on purchasing behavior in e-commerce. However, the relationship between reviews and ratings has received limited attention. For instance, a product may have a high rating but only average reviews. Such conflicting feedback can cause confusion and uncertainty about a product, leading to decreased trust in it. This thesis carries out a natural-language-based machine learning study to analyze this relationship using e-commerce big data of product reviews and ratings. Towards answering this relationship question using natural language processing (NLP), we first employ data-driven sentiment analysis to obtain numeric sentiment scores from the reviews, which are then used to study the correlation with actual ratings. For sentiment analysis, we consider the use of both glass-box (rule-based) and black-box (BERT) models. We find that while the black-box model is more correlated with product ratings, there are interesting counterexamples where the sentiment analysis results of the glass-box model are better aligned with the rating. Next, we explore how well ratings can be predicted from the text reviews, and whether sentiment scores can further improve the classification of reviews. We find that neither the opaque nor the glass-box classification model consistently yields better accuracy, and that classification accuracy mostly improves when reviews are augmented with BERT sentiment scores. Furthermore, to understand what different models use to predict ratings from reviews, we employ Local Interpretable Model-Agnostic Explanations (LIME) to explain the impact of words in reviews on the decisions of the classification models. Noting that different models can give similar predictions, a phenomenon known as the Rashomon Effect, our work provides insights on which words actually contribute to the decision-making of classification models, even in scenarios where an incorrect classification is made.
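The sentiment-to-rating correlation step described in the abstract can be sketched in miniature. The toy lexicon, reviews, and ratings below are purely illustrative (the thesis itself uses full rule-based and BERT sentiment models on real e-commerce data); this only shows the general shape of the pipeline: score each review, then correlate the scores with the star ratings.

```python
# A minimal, hypothetical sketch of a glass-box (lexicon/rule-based) sentiment
# scorer and its Pearson correlation with star ratings. The lexicon and data
# below are invented for illustration, not taken from the thesis.

from statistics import mean

# Toy sentiment lexicon: word -> polarity in [-1, 1] (illustrative values).
LEXICON = {"great": 1.0, "love": 0.8, "good": 0.5,
           "average": 0.0, "poor": -0.5, "terrible": -1.0}

def sentiment_score(review: str) -> float:
    """Average the lexicon polarities of the words found in the review."""
    hits = [LEXICON[w] for w in review.lower().split() if w in LEXICON]
    return mean(hits) if hits else 0.0

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Illustrative reviews paired with their (made-up) star ratings.
reviews = ["great product love it", "good but average", "terrible quality poor"]
ratings = [5, 3, 1]
scores = [sentiment_score(r) for r in reviews]
```

In the actual study, the glass-box scorer is replaced by a full rule-based model and the black-box scorer by BERT; the correlation computed here is what lets the two be compared against product ratings.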
dc.identifier: https://doi.org/10.13016/6cvw-dlwd
dc.identifier.uri: http://hdl.handle.net/1903/31789
dc.language.iso: en
dc.subject.pqcontrolled: Computer engineering
dc.subject.pqcontrolled: Computer science
dc.title: STUDYING PRODUCT REVIEWS USING SENTIMENT ANALYSIS BASED ON INTERPRETABLE MACHINE LEARNING
dc.type: Thesis

Files

Original bundle
Name: Atrey_umd_0117N_23941.pdf
Size: 1.5 MB
Format: Adobe Portable Document Format