Array (Mar 2022)
Deep autoencoder-based fuzzy c-means for topic detection
Abstract
Topic detection is the process of determining topics from a collection of textual data. One family of topic detection methods is clustering-based and assumes that the cluster centroids are the topics. Clustering has the advantage that it can process data whose representations contain negative values. It can therefore be combined with a broader range of representation learning methods. In this paper, we adopt deep learning for topic detection by combining a deep autoencoder with fuzzy c-means, a method we call “deep autoencoder-based fuzzy c-means”. The encoder of the autoencoder learns a lower-dimensional representation of the data. Fuzzy c-means groups the lower-dimensional representations to identify the centroids. The decoder of the autoencoder transforms the centroids back into the original representation so that they can be interpreted as topics. Our simulation shows that deep autoencoder-based fuzzy c-means improves on the coherence score of eigenspace-based fuzzy c-means and is comparable to the leading standard methods, i.e., nonnegative matrix factorization and latent Dirichlet allocation.
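The encode, cluster, decode pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `fuzzy_c_means` and `detect_topics`, the fuzzifier `m=2.0`, and the toy data are assumptions made for illustration, and the `encode` and `decode` functions are hand-written linear stand-ins for a trained deep autoencoder.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means; returns (centroids, memberships)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, normalised so each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centroids = []
    for _ in range(iters):
        # Centroid update: mean of the points weighted by u_ik ** m.
        centroids = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centroids.append([sum(w[i] * points[i][d] for i in range(n)) / total
                              for d in range(dim)])
        # Membership update: proportional to distance ** (-2 / (m - 1)).
        shift = 0.0
        for i in range(n):
            inv = [max(math.dist(points[i], ck), 1e-12) ** (-2.0 / (m - 1.0))
                   for ck in centroids]
            total = sum(inv)
            new_row = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if shift < tol:
            break
    return centroids, u


def detect_topics(docs, encode, decode, n_topics, vocab, top_n=3):
    """Encode documents, cluster the codes, decode centroids into top words."""
    codes = [encode(x) for x in docs]
    centroids, _ = fuzzy_c_means(codes, n_topics)
    topics = []
    for ck in centroids:
        weights = decode(ck)  # centroid mapped back to word space
        ranked = sorted(range(len(vocab)), key=lambda j: weights[j], reverse=True)
        topics.append([vocab[j] for j in ranked[:top_n]])
    return topics


# Hypothetical stand-ins for a trained encoder and decoder (assumption).
def encode(x):
    return [x[0] + x[1], x[2] + x[3]]


def decode(z):
    return [0.6 * z[0], 0.4 * z[0], 0.6 * z[1], 0.4 * z[1]]


# Toy word-count vectors over a four-word vocabulary (made-up data).
vocab = ["goal", "match", "vote", "party"]
docs = [[3, 2, 0, 0], [2, 3, 0, 0], [4, 1, 0, 0],
        [0, 0, 3, 2], [0, 0, 2, 2], [0, 0, 1, 4]]
topics = detect_topics(docs, encode, decode, n_topics=2, vocab=vocab, top_n=2)
print(topics)
```

In a real setting, `encode` and `decode` would be the two halves of an autoencoder trained on the document vectors, and each topic would be read off from the highest-weighted words of a decoded centroid.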