大数据 (Jan 2025)
Data governance framework for artificial intelligence
Abstract
It is widely agreed in both industry and academia that data plays a crucial role in the development and application of artificial intelligence. Building on the interaction between artificial intelligence and data, and on data-centric AI development practice, this paper proposes a data governance framework for artificial intelligence consisting of six stages: source data governance, pre-training data governance, evaluation data governance, fine-tuning data governance, inference data governance, and operation and maintenance data governance. Each stage has its own key tasks and technologies. The paper also analyzes in depth the data governance practices and successful experiences of ChatGPT, Ziya2, and artificial intelligence models in the energy field to verify the effectiveness of the framework. The results show that the framework plays an important role in improving the performance of artificial intelligence models and optimizing data governance processes. The framework provides a reference for theoretical and practical innovation in AI-oriented data governance.