Informatică economică (Jan 2012)

Distributed Parallel Architecture for "Big Data"

  • Catalin BOJA,
  • Adrian POCOVNICU,
  • Lorena BATAGAN

Journal volume & issue
Vol. 16, no. 2
pp. 116 – 127

Abstract

This paper extends "Distributed Parallel Architecture for Storing and Processing Large Datasets", presented at the WSEAS SEPADS’12 conference in Cambridge. The original version reviewed the benefits of using a distributed parallel architecture to store and process large datasets. This paper analyzes the problem of storing, processing, and retrieving meaningful insight from petabytes of data. It surveys current distributed and parallel data processing technologies and, based on them, proposes an architecture that can be used to solve the analyzed problem. This version places more emphasis on distributed file systems and the ETL processes involved in a distributed environment.

Keywords