Alexandria Engineering Journal (Dec 2022)
A new generalized Rayleigh distribution with analysis to big data of an online community
Abstract
Big data refers to large, complex collections of data that cannot be handled easily by traditional data-processing methods. Popular online data science communities include Kaggle, IBM Data Community, Reddit, Open Data Science, Data Science Central, Data Community DC, Stack Overflow, Dataquest, the Data Science Society, and DrivenData. Among these, Reddit is a promising platform that connects millions of people and gives businesses a way to reach a large audience. In this paper, we introduce a new extended form of the generalized Rayleigh distribution to model Reddit advertising and breast cancer data sets. The proposed model, called the new generalized Rayleigh distribution, possesses heavy-tailed properties. We derive its maximum likelihood estimators along with certain mathematical properties. Finally, we apply the new generalized Rayleigh distribution to the Reddit advertising and breast cancer data sets and compare it with other generalized forms of the Rayleigh distribution.
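The abstract does not state the density of the proposed distribution, so the following is only a minimal sketch of maximum likelihood fitting for the classical two-parameter generalized Rayleigh (Burr Type X) distribution, one of the baseline families such a model would be compared against. It assumes the CDF F(x) = (1 - exp(-(λx)^2))^α for x > 0. The function names (`sample_gr`, `gr_loglik`, `profile_alpha`, `fit_gr`), the search bracket for λ, and the simulated data are illustrative assumptions, not the paper's method or data. For fixed λ the shape estimate has a closed form, so the sketch maximizes the profile log-likelihood over λ with a golden-section search, which assumes the profile is unimodal on the bracket.

```python
import math
import random


def sample_gr(alpha, lam, n, rng):
    """Draw n values by inverting F(x) = (1 - exp(-(lam*x)**2))**alpha."""
    out = []
    for _ in range(n):
        # Clamp u away from 0 and 1 so the logarithm below stays finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        out.append(math.sqrt(-math.log1p(-u ** (1.0 / alpha))) / lam)
    return out


def gr_loglik(xs, alpha, lam):
    """Log-likelihood of the two-parameter generalized Rayleigh model."""
    n = len(xs)
    total = n * (math.log(2.0) + math.log(alpha) + 2.0 * math.log(lam))
    for x in xs:
        t = (lam * x) ** 2
        # log(1 - exp(-t)) is evaluated through expm1 for numerical stability.
        total += math.log(x) - t + (alpha - 1.0) * math.log(-math.expm1(-t))
    return total


def profile_alpha(xs, lam):
    """Closed-form maximizer of the log-likelihood in alpha for fixed lam."""
    s = sum(math.log(-math.expm1(-(lam * x) ** 2)) for x in xs)
    return -len(xs) / s


def fit_gr(xs, iters=60):
    """Maximize the profile log-likelihood over log(lam) by golden section."""
    n = len(xs)
    scale = math.sqrt(sum(x * x for x in xs) / n)  # root mean square of data

    def profile(log_lam):
        lam = math.exp(log_lam)
        return gr_loglik(xs, profile_alpha(xs, lam), lam)

    # Assumed bracket: lam * scale between 0.05 and 5.
    lo, hi = math.log(0.05 / scale), math.log(5.0 / scale)
    g = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = hi - g * (hi - lo), lo + g * (hi - lo)
    fc, fd = profile(c), profile(d)
    for _ in range(iters):
        if fc < fd:  # the maximum lies in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + g * (hi - lo)
            fd = profile(d)
        else:  # the maximum lies in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - g * (hi - lo)
            fc = profile(c)
    lam_hat = math.exp((lo + hi) / 2.0)
    return profile_alpha(xs, lam_hat), lam_hat


# Simulated data with known parameters (not the Reddit or breast cancer data).
rng = random.Random(0)
data = sample_gr(alpha=2.0, lam=0.5, n=5000, rng=rng)
alpha_hat, lam_hat = fit_gr(data)
print(f"alpha_hat={alpha_hat:.3f}, lam_hat={lam_hat:.3f}")
```

On simulated data the estimates should land close to the generating values. The same profile-then-search pattern could be adapted to an extended family once its density is written down, typically with a multi-parameter optimizer in place of the one-dimensional search.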