International Journal of Transportation Science and Technology (Sep 2022)

Truck industry classification from anonymous mobile sensor data using machine learning

  • Taslima Akter,
  • Sarah Hernandez

Journal volume & issue
Vol. 11, no. 3
pp. 522 – 535

Abstract

Freight demand forecasting models used by federal, state, and local transportation agencies to predict future freight flows are often based on economic forecasts of industry growth and/or commodity production/consumption rates, which are then used to estimate expected freight movement, e.g., truck volumes. Unfortunately, there is a lack of data connecting industry served and commodity carried to freight movements, which limits the accuracy and usability of such models. While the private sector collects robust data on freight movements, including commodity carried, these data are anonymized to protect privacy when shared with the public sector. Thus, there is a critical need to re-identify industry served and commodity carried from anonymous freight movement data in ways that maintain privacy standards. To address this research gap, we developed a classification model using data mining and machine learning methods to predict the industry served by a truck from daily activity patterns of trip and stop sequences extracted from truck Global Positioning System (GPS) data. A Weighted Random Forest (WRF) supervised machine learning model is used to predict five industry groups: farm products, mining materials, chemicals, manufactured goods, and miscellaneous mixed goods. The WRF model achieves 88% prediction accuracy and explicitly accounts for class imbalance. The model does not reveal identifying details such as fleet, driver, or company; thus, we are able to provide necessary insight into the relationship between truck movement and economic (industry) forecasts without violating data privacy standards. Ultimately, our model allows large streams of truck movement data to be leveraged for freight travel demand forecasting.
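
The abstract does not state how the WRF weights its classes. As a minimal sketch of what accounting for class imbalance can look like, the example below assumes inverse-frequency ("balanced") class weights, one common choice, and applies them to an accuracy score. The function names, the shortened class labels, and the toy label counts are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    Assumed weighting scheme for illustration; the paper's actual
    WRF weights are not given in the abstract.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def weighted_accuracy(y_true, y_pred, weights):
    """Accuracy in which each sample counts by the weight of its true class."""
    total = sum(weights[t] for t in y_true)
    correct = sum(weights[t] for t, p in zip(y_true, y_pred) if t == p)
    return correct / total


# Hypothetical, deliberately imbalanced truck-day labels (not real data).
labels = ["manufactured"] * 6 + ["farm"] * 3 + ["mining"] * 1
weights = balanced_class_weights(labels)

# A trivial classifier that always predicts the majority class.
majority_pred = ["manufactured"] * len(labels)
plain_acc = sum(t == p for t, p in zip(labels, majority_pred)) / len(labels)
weighted_acc = weighted_accuracy(labels, majority_pred, weights)
```

With these weights every class contributes equally to the total, so a classifier that ignores the rare classes scores well on plain accuracy but poorly on the weighted score. A weighted forest would apply the same kind of weights during training rather than only at evaluation.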

Keywords