Journal of Frontiers of Computer Science and Technology (Jisuanji Kexue yu Tansuo), Sep 2024
Survey of AIGC Large Model Evaluation: Enabling Technologies, Vulnerabilities and Mitigation
Abstract
Artificial intelligence generated content (AIGC) models have attracted widespread attention and application worldwide owing to their excellent content generation capabilities. However, the rapid development of large AIGC models also brings a series of hidden dangers, such as concerns about the interpretability, fairness, security, and privacy of model-generated content. To reduce unknown risks and their harms, comprehensive measurement and evaluation of large AIGC models is becoming increasingly important. Academia has initiated studies on AIGC large model evaluation, aiming to address the related challenges effectively and avoid potential risks. This paper summarizes and analyzes these evaluation studies. Firstly, it provides an overview of the model evaluation process, covering the preparation required before evaluation and the corresponding measurement indicators, and systematically organizes existing evaluation benchmarks. Secondly, it discusses representative applications of large AIGC models in finance, politics, and healthcare, together with the problems they raise. It then examines measurement methods in depth from different perspectives, such as interpretability, fairness, robustness, security, and privacy; deconstructs the new issues that deserve attention in AIGC large model evaluation; and proposes ways to cope with the new challenges of large model evaluation. Finally, it discusses the future challenges of AIGC large model evaluation and envisions its future development directions.
Keywords