IEEE Access (Jan 2023)
MapReduce for Graphs Processing: New Big Data Algorithm for 2-Edge Connected Components and Future Ideas
Abstract
Finding connectivity in graphs has numerous applications, such as social network analysis, data mining, intra-city or inter-cities connectivity, neural network, and many more. The deluge of graph applications makes graph connectivity problems extremely important and worthwhile to explore. Currently, there are many single-node algorithms for graph mining and analysis; however, those algorithms primarily apply to small graphs and are implemented on a single machine node. Finding 2-edge connected components (2-ECCs) in massive graphs (billions of edges and vertices) is impractical and time-consuming, even with the best-known single-node algorithms. Processing a big graph in a parallel and distributed fashion saves considerable time to finish processing. Moreover, it enables stream data processing by allowing quick results for vast and continuous nature data sets. This research proposes a distributed and parallel algorithm for finding 2-ECCs in big undirected graphs (subsequently called “ $BiECCA$ ”) and presents its time complexity analysis. The proposed algorithm is implemented on a MapReduce framework and uses an existing algorithm to find connected components (CCs) in a graph as a sub-step. Finally, we suggest a few novel ideas and approaches as extensions to our work.
Keywords