IEEE Access (Jan 2024)
Text Regression Analysis: A Review, Empirical, and Experimental Insights
Abstract
Effective management and analysis of large-scale textual data present significant challenges, notably because of high storage and processing demands. Text regression analysis, a branch of text mining, has proven invaluable in enabling individuals, researchers, and businesses to derive meaningful insights from rapidly growing volumes of textual data. However, existing surveys group algorithms too broadly, which often leads to confusion and imprecise evaluations. This paper addresses these issues by introducing a methodological taxonomy tailored to text regression analysis. The taxonomy organizes algorithms at two levels: methodology category and, within each category, methodology technique. To evaluate the accuracy of the different techniques and categories, we conduct both empirical and experimental evaluations of text regression techniques. The empirical evaluations are based on four criteria, while the experimental evaluations rank (1) algorithms that employ the same technique, (2) techniques within the same category, and (3) the categories against one another. This combination of methodological structuring and rigorous evaluation offers a nuanced and comprehensive understanding of text regression algorithms, equipping researchers to make informed decisions.
Keywords