Complexity (Jan 2021)

Research on the Key Issues of Big Data Quality Management, Evaluation, and Testing for Automotive Application Scenarios

  • Yingzi Wang,
  • Ce Yu,
  • Jue Hou,
  • Yongjia Zhang,
  • Xiangyi Fang,
  • Shuyue Wu

DOI
https://doi.org/10.1155/2021/9996011
Journal volume & issue
Vol. 2021

Abstract

This paper analyzes the key issues of quality management, evaluation, and detection for big data in automotive application scenarios. A generalized big data quality management model and programming framework are proposed, and a set of data quality detection and repair interfaces is built to express the processing semantics of various data quality issues. With this model and these interfaces, users can quickly build custom data quality detection and repair tasks for different data quality requirements. To improve the operational efficiency of complex data quality management algorithms on large-scale data, parallel versions are studied and implemented for the detection and repair algorithms with long computation times: priority-based, multi-condition functional dependency detection and repair algorithms; entity detection and extraction algorithms based on semantic information and chunking techniques; and naive Bayes-based missing value filling algorithms. The paper also proposes a data validity evaluation algorithm that enhances the validity of the original data in practical applications by adding temporal weights, and validates it experimentally. By comprehensively accounting for data importance, network busyness, transmission duration, and failure conditions in the detection process, efficiency is increased by 20%. An adaptive data integrity detection method based on a randomized algorithm and an encryption algorithm is also designed. Experimental verification shows that this method can effectively detect the integrity of data during transmission and improve the usable value of the data, with a final improvement of 30.5%.
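As a rough illustration of what functional dependency detection and repair involves, the sketch below checks a single dependency on a small in-memory table and repairs violations by majority vote. It is not the paper's priority-based, multi-condition, parallel algorithm. The function names, the `model` and `engine` attributes, the sample records, and the majority-vote repair rule are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def detect_fd_violations(rows, lhs, rhs):
    """Find groups that violate the functional dependency lhs -> rhs.

    Returns a dict mapping each violating lhs key (a tuple) to a Counter
    of the conflicting rhs values seen for that key.
    """
    groups = defaultdict(Counter)
    for row in rows:
        key = tuple(row[attr] for attr in lhs)
        groups[key][row[rhs]] += 1
    # A dependency holds for a key only if exactly one rhs value occurs.
    return {key: counts for key, counts in groups.items() if len(counts) > 1}


def repair_by_majority(rows, lhs, rhs):
    """Return repaired copies of rows; the input rows are not modified.

    Assumed repair rule: within each violating group, overwrite rhs with
    the most frequent value in that group.
    """
    violations = detect_fd_violations(rows, lhs, rhs)
    repaired = []
    for row in rows:
        key = tuple(row[attr] for attr in lhs)
        fixed = dict(row)
        if key in violations:
            fixed[rhs] = violations[key].most_common(1)[0][0]
        repaired.append(fixed)
    return repaired


# Hypothetical records: each vehicle model should map to one engine type.
records = [
    {"model": "A1", "engine": "1.5T"},
    {"model": "A1", "engine": "1.5T"},
    {"model": "A1", "engine": "2.0T"},  # conflicts with the two rows above
    {"model": "B2", "engine": "EV"},
]

violations = detect_fd_violations(records, lhs=["model"], rhs="engine")
repaired = repair_by_majority(records, lhs=["model"], rhs="engine")
```

Because each left-hand-side key is processed independently, a check of this kind could in principle be partitioned by key across workers, which is one plausible route to the parallelization the abstract refers to.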