MethodsX (Jan 2021)

Automated cleansing and harmonization of international trade data

  • Sandra Oliveira,
  • César Capinha,
  • Jorge Rocha

Journal volume & issue
Vol. 8
p. 101567

Abstract

Large volumes of data are becoming increasingly available and can be very valuable for analysing diverse phenomena. These data can originate from multiple sources and be recorded in diverse formats, so they require preliminary scrutiny before they can be used in scientific analyses. This first, crucial phase of filtering and cleansing data is usually cumbersome and time-consuming, but automated routines can be developed to help researchers. A routine written in the R language is presented here to screen, harmonize and aggregate international trade data, which represent the trade flows between countries for specific products, with monthly flows covering at least 15 years for most countries. The R script implementing these routines is provided and can be easily adapted to other datasets with similar issues.

  • A step-by-step procedure for cleansing and harmonizing international trade data, using the R programming language, is presented
  • Automated routines are very effective in obtaining robust and filtered data inputs to integrate in scientific models
  • Spatial and temporal patterns of worldwide trade relations can be explored to enhance our understanding of various associated phenomena
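To make the screen, harmonize and aggregate sequence concrete, the following is a minimal sketch in base R. It is not the authors' script: the column names (`reporter`, `partner`, `period`, `value`), the toy records, the helper `clean_trade` and the specific screening rules (dropping missing or negative values and exact duplicate records) are all assumptions chosen for illustration.

```r
# Hypothetical toy data: column names and values are illustrative assumptions,
# not the layout of the dataset described in the article.
trade <- data.frame(
  reporter = c("Portugal", "portugal ", "Spain", "Spain", "Spain", "France"),
  partner  = c("Spain", "Spain", "France", "France", "France", "Portugal"),
  period   = c("2015-01", "2015-01", "2015-02", "2015-02", "2015-03", "2015-01"),
  value    = c(100, 50, 200, 200, NA, -5),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one possible screen -> harmonize -> aggregate pipeline.
clean_trade <- function(df) {
  # Harmonize: trim whitespace and unify the case of country names, so that
  # "Portugal" and "portugal " are treated as the same reporter.
  df$reporter <- toupper(trimws(df$reporter))
  df$partner  <- toupper(trimws(df$partner))

  # Screen: drop records with missing or negative trade values (assumed rule).
  df <- df[!is.na(df$value) & df$value >= 0, ]

  # Screen: drop exact duplicate records (assumed to be recording errors).
  df <- unique(df)

  # Aggregate: sum monthly flows into yearly totals per country pair.
  df$year <- as.integer(substr(df$period, 1, 4))
  aggregate(value ~ reporter + partner + year, data = df, FUN = sum)
}

annual <- clean_trade(trade)
print(annual)
```

With this toy input, the two differently spelled Portugal records merge into one yearly flow, the duplicated Spain record is counted once, and the missing and negative values are discarded, leaving two country pairs. Real trade data would likely need dataset-specific rules, for example a lookup table of standard country codes rather than simple case folding.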

Keywords