International Journal of Information Management Data Insights (Apr 2023)

Exploring commonly used terms from online reviews in the fashion field to predict review helpfulness

  • Maryam Mahdikhani

Journal volume & issue
Vol. 3, no. 1
p. 100172

Abstract

Online shopping for fashion products is a challenging process for consumers. Although customer reviews can facilitate purchasing, review content and helpfulness voting systems can be unreliable. This study applies linguistic approaches to term recognition in order to identify and extract frequent terms in fashion reviews and predict review helpfulness. Three feature sets are built: topics from the latent Dirichlet allocation (LDA) model, bi-grams from the term frequency-inverse document frequency (TF-IDF) vectorizer, and topics plus bi-grams from the TF-IDF vectorizer. The feature sets are then used to train four supervised algorithms on an imbalanced dataset to compare model performance. Models are validated using a dataset of 828,700 customer reviews collected from the Amazon Fashion platform. The experimental results show that combining LDA with n-grams from the TF-IDF vectorizer in a random forest classifier outperforms the other models, with an accuracy of 0.81 and an F1-score of 0.78. Furthermore, the study indicates that reviews describing fabric quality, trend and fashion aesthetics, size details, price, and return experience are more helpful. These results can help customers narrow their search terms and help retailers optimize their review systems more intelligently, especially on the first page of a product's description.
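
As a rough illustration of the bi-gram TF-IDF features mentioned in the abstract, the following minimal Python sketch weights adjacent word pairs across a few reviews. It is not the author's implementation: the function names, the toy reviews, and the smoothed IDF formula are assumptions made for this example, and the LDA topic features and the classifiers are omitted.

```python
import math
from collections import Counter


def bigrams(text):
    """Lowercase, split on whitespace, and return adjacent word pairs."""
    tokens = text.lower().split()
    return [" ".join(pair) for pair in zip(tokens, tokens[1:])]


def tfidf_bigrams(reviews):
    """Return one {bigram: weight} dict per review.

    Assumption: raw term frequency times a smoothed IDF,
    log((1 + n) / (1 + df)) + 1. This is one common variant;
    the abstract does not state the exact weighting used.
    """
    docs = [Counter(bigrams(r)) for r in reviews]
    n = len(docs)
    # Document frequency: number of reviews containing each bi-gram.
    df = Counter(term for doc in docs for term in doc)
    idf = {t: math.log((1 + n) / (1 + d)) + 1 for t, d in df.items()}
    return [{t: tf * idf[t] for t, tf in doc.items()} for doc in docs]


# Hypothetical toy reviews, not taken from the Amazon Fashion dataset.
reviews = [
    "fabric quality is great",
    "fabric quality is poor and size runs small",
    "size runs small return was easy",
]
weights = tfidf_bigrams(reviews)
```

In this sketch, a bi-gram that occurs in only one review (such as "is great") receives a higher weight than one shared by several reviews (such as "fabric quality"). In a full pipeline, vectors like these would be joined with per-review topic proportions before training a classifier.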

Keywords