Baltic Journal of Economic Studies (Nov 2017)
CLASSIFICATION OF TASKS OF DATA MINING AND DATA PROCESSING IN THE ECONOMY
Abstract
The subject of the paper is the methods for classifying data mining and data processing tasks in the economy, together with their classification characteristics. Methodology. The research uses the general scientific methods of analysis, synthesis, generalization, and comparison. Taxonomy methods are used to compile the classification. Relevant economic problems are selected by reviewing scientific publications on data analysis and processing. The purpose of the article is to compile an up-to-date classification of data mining and data processing tasks in the economy and to refine the terminology of this branch of science. Results. A four-level classification of data mining tasks is proposed. All economic data mining tasks are divided into two large groups: predictive and descriptive. Each group is subdivided into several classes that combine tasks with similar taxonomic features: classification, regression, clustering, link analysis, and outlier analysis. Classes of tasks are divided into types. An important criterion for this division is the dimensionality of the input data representation, that is, the number of neighbours each individual data element has. The data can be presented in the form of series, matrices, and graphs, and the methods of analysing each form differ considerably. The definition of data processing is clarified, and a classification of the relevant tasks is proposed. At the top level of this classification, data processing tasks are divided into two groups, depending on whether or not the order of the elements in the input data changes. The main classes of data processing tasks are identified, including ranking, sorting, filtering, cleansing, recovery, and quantization. Practical use. The results of the research can be used to model intelligent decision-support systems. They improve problem formulation in data mining and data processing tasks and help formalize the procedures for selecting solution methods, which in turn increases economic efficiency.
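The top-level criterion for data processing tasks, whether the order of the input elements changes, can be illustrated with a minimal Python sketch. The function and task names (`preserves_order`, `keep_positive`) are hypothetical and do not come from the paper, and the sketch does not reproduce the paper's assignment of classes to the two groups. It only checks, for one given input, whether the elements a task returns appear in the same relative order as in the input, and it assumes the task does not alter element values (so it would not apply to quantization as written).

```python
from typing import Callable, List, Sequence


def preserves_order(task: Callable[[Sequence[float]], List[float]],
                    data: Sequence[float]) -> bool:
    """Return True if task(data) is a subsequence of data.

    Assumption (ours, for illustration): the task only drops or
    reorders elements and never changes their values.
    """
    result = task(data)
    remaining = iter(data)
    # `x in remaining` advances the iterator until x is found, so every
    # output element must occur after the previous one in the input.
    return all(x in remaining for x in result)


def keep_positive(values: Sequence[float]) -> List[float]:
    """A hypothetical filtering task: drop non-positive elements."""
    return [v for v in values if v > 0]


sample = [3.0, -1.0, 2.0]
print(preserves_order(keep_positive, sample))  # filtering keeps relative order
print(preserves_order(sorted, sample))         # sorting reorders this input
```

The check is made per input: a sorting task applied to data that is already sorted would also pass, so a single run describes the behaviour on that input rather than the task as a whole.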