Journal of Modeling in Engineering (مجله مدل سازی در مهندسی) (Dec 2018)
Data Tarn: A New Approach for Management and Real-Time Analyses of Big Data
Abstract
As the speed of data generation increases, so does the need to process, store, and analyze Big Data. Related work has produced real-time data warehouses, but because much of today's Big Data is unstructured, a data warehouse with the traditional structure cannot meet the new management requirements of this type of data. Recently, the Data Lake has been proposed for unstructured data (with BASE properties). However, the coexistence of important structured data (with ACID properties) on the one hand and less sensitive unstructured big data on the other creates new problems when Big Data is managed with these methods. In this paper we propose a solution that stores structured and unstructured data simultaneously and responds to users' queries in real time. One important result of this research, obtained by comparing the data warehouse and the Data Lake, is that the lake is not a replacement for the data warehouse: the data warehouse retains particular uses, especially for financial data, because it complies with the ACID properties, whereas the Data Lake caters to the requirements of BASE. The proposed idea has three main advantages: (1) simultaneous use of a data warehouse and a Data Lake to meet the organization's data needs with the benefits of both; (2) separation of new data from old data to achieve real-time response; (3) increased parallelism, so that data loading and query processing proceed simultaneously, reducing time cost.
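The three advantages listed in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `Store`, `DataTarn`, `ingest`, `archive`, and `query_recent`, the in-memory lists, and the schema check are all hypothetical stand-ins chosen only to show the routing of structured and unstructured data, the split between new and old data, and loading that runs in parallel.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class Store:
    """Hypothetical store split into a small 'recent' part and a large 'historical' part."""
    recent: list = field(default_factory=list)
    historical: list = field(default_factory=list)
    lock: threading.Lock = field(default_factory=threading.Lock)

    def load(self, item):
        # New data lands only in the small recent partition.
        with self.lock:
            self.recent.append(item)

    def archive(self):
        # Move recent data into the historical partition (e.g. on a schedule).
        with self.lock:
            self.historical.extend(self.recent)
            self.recent.clear()

    def query_recent(self, predicate):
        # Real-time queries scan only the recent partition.
        with self.lock:
            return [x for x in self.recent if predicate(x)]


class DataTarn:
    """Illustrative router: structured records go to the warehouse, everything else to the lake."""

    def __init__(self, schema):
        self.schema = set(schema)   # assumed required field names
        self.warehouse = Store()    # stands in for an ACID data warehouse
        self.lake = Store()         # stands in for a BASE Data Lake

    def ingest(self, item):
        structured = isinstance(item, dict) and self.schema.issubset(item.keys())
        (self.warehouse if structured else self.lake).load(item)


tarn = DataTarn(schema=["account", "amount"])
items = [
    {"account": "A1", "amount": 250},
    "free-text log line",
    {"account": "A2", "amount": 90},
    b"\x00raw",
]

# Loader threads run concurrently; the lock keeps loads and queries consistent.
loaders = [threading.Thread(target=tarn.ingest, args=(it,)) for it in items]
for t in loaders:
    t.start()
for t in loaders:
    t.join()

# Query only the recent structured data.
large = tarn.warehouse.query_recent(lambda r: r["amount"] > 100)
```

In this sketch, keeping the recent partition small is what makes the query cheap, and calling `archive()` periodically would move data out of the real-time path. A real system would replace the lists with an actual warehouse and lake.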
Keywords