Journal of Traffic and Transportation Engineering (English ed. Online) (Oct 2023)

An overview of Hadoop applications in transportation big data

  • Changxi Ma,
  • Mingxi Zhao,
  • Yongpeng Zhao

Journal volume & issue
Vol. 10, no. 5
pp. 900 – 917

Abstract

As an open-source cloud computing platform, Hadoop is widely used across many sectors because of its high dependability, high scalability, and considerable strengths in processing and analyzing massive amounts of data. Consequently, deriving valuable insights from transportation big data calls for analysis and mining on the Hadoop big data platform. To summarize the latest research progress on the application of Hadoop to transportation big data, we conducted a comprehensive review of 98 relevant articles published from 2012 to the present. First, a bibliometric analysis was performed using VOSviewer software to identify the evolution trend of keywords. Second, we introduced the core components of Hadoop. We then systematically reviewed the 98 articles, identified the latest research progress, and classified the main application scenarios of Hadoop and its optimization framework. Based on this analysis, we identified the research gaps and directions for future work in this area. Our review highlights that Hadoop has played a significant role in transportation big data research over the past decade. Specifically, the focus has been on transportation infrastructure monitoring, taxi operation management, travel feature analysis, traffic flow prediction, transportation big data analysis platforms, traffic event monitoring and status discrimination, license plate recognition, and shortest-path computation. Additionally, the optimization framework of Hadoop has been studied in two main areas: the optimization of Hadoop's computational model and the optimization of Hadoop combined with Spark. Research in the field of transportation big data has produced a number of results. However, there is little systematic research on the core technology of Hadoop, and the integration of Hadoop with transportation big data remains insufficient in both breadth and depth.
In the future, we suggest combining Hadoop with other big data frameworks that process real-time data sources, such as Storm and Flink, to improve the real-time processing and analysis of transportation big data. Meanwhile, research on multi-source heterogeneous transportation big data remains a key focus. Improving existing big data technology to support the analysis, and even the compression, of transportation big data could lead to new breakthroughs in intelligent transportation.

Keywords