IEEE Access (Jan 2025)

Integrating Travel Survey and Amap API Data Into Travel Mode Choice Analysis With Interpretable Machine Learning Models: A Case Study in China

  • Li Tang,
  • Xuan Lin,
  • Jingcai Yu,
  • Chuanli Tang

DOI
https://doi.org/10.1109/ACCESS.2025.3540082
Journal volume & issue
Vol. 13
pp. 27852 – 27867

Abstract

Travel survey data has long been one of the fundamental sources for analyzing travel mode choice behavior. It provides a wealth of travel-related information and detailed household attributes of individuals. However, precise travel path data, especially data relating to choices, is usually lacking. The online Amap API service can help compensate for this gap by providing tools that calculate routes and report key information such as travel time and cost. This study integrated revealed choice survey data with path data from the online Amap API, creating a multi-source combined dataset. Two machine learning models, extreme gradient boosting (XGB) and random forest (RF), were then applied to predict travel mode using both the original travel survey data and the integrated data. The results show that overall prediction accuracy improves from 0.82 to 0.84 for the XGB model and from 0.78 to 0.82 for the RF model, demonstrating that integrating Amap API data can improve model performance. Additionally, the Shapley additive explanations (SHAP) method was applied for model interpretation. The results identify travel cost, travel time, age, and monthly income as the key factors. These findings highlight the importance of fusing big data techniques with small data to enhance predictive accuracy and provide valuable insights for travel behavior studies.
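
The integration step described in the abstract can be pictured as a join between survey records and per-mode route attributes. The following is only a minimal sketch under stated assumptions: the field names (`trip_id`, `time_min`, `cost`, and so on), the sample values, and the `integrate` helper are all hypothetical and are not taken from the study, and no real Amap API call is shown.

```python
# Hypothetical survey records; names and values are illustrative only.
survey = [
    {"trip_id": 1, "age": 34, "monthly_income": 8000, "chosen_mode": "car"},
    {"trip_id": 2, "age": 21, "monthly_income": 3000, "chosen_mode": "bus"},
]

# Hypothetical route attributes per (trip, mode), standing in for what a
# routing service might return for each candidate mode.
routes = [
    {"trip_id": 1, "mode": "car", "time_min": 18.0, "cost": 12.0},
    {"trip_id": 1, "mode": "bus", "time_min": 35.0, "cost": 2.0},
    {"trip_id": 2, "mode": "car", "time_min": 25.0, "cost": 15.0},
    {"trip_id": 2, "mode": "bus", "time_min": 40.0, "cost": 2.0},
]


def integrate(survey_rows, route_rows):
    """Attach per-mode travel time and cost columns to each survey record."""
    # Index route attributes by trip, then by mode.
    by_trip = {}
    for r in route_rows:
        by_trip.setdefault(r["trip_id"], {})[r["mode"]] = r

    combined = []
    for s in survey_rows:
        row = dict(s)  # copy so the original survey record is untouched
        for mode, r in sorted(by_trip.get(s["trip_id"], {}).items()):
            row[f"time_min_{mode}"] = r["time_min"]
            row[f"cost_{mode}"] = r["cost"]
        combined.append(row)
    return combined


combined = integrate(survey, routes)
```

Each row of `combined` then carries both the respondent attributes and the route attributes of every candidate mode, which is the kind of table a classifier such as XGB or RF could be trained on.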

Keywords