International Journal of Crowd Science (Nov 2018)
Anomaly data management and big data analytics: an application on disability datasets
Abstract
Purpose - Disability datasets contain information about disabled populations. By analyzing these datasets, professionals who work with disabled populations can better understand the inherent characteristics of those populations and can make working plans and policies that effectively help them.
Design/methodology/approach - In this paper, the authors propose a big data management and analytic approach for disability datasets.
Findings - Using a set of data mining algorithms, the proposed approach provides two services. The data management scheme improves the quality of disability data by estimating missing attribute values and detecting anomalous and low-quality data instances. The data mining scheme explores useful patterns that reflect the correlations, associations and interactions among the disability data attributes. Experiments on a real-world dataset are conducted to demonstrate the effectiveness of the approach.
Originality/value - The proposed approach enables data-driven decision-making for professionals who work with disabled populations.
Keywords