Jisuanji kexue (Dec 2022)

Study on Anomaly Detection and Real-time Reliability Evaluation of Complex Component System Based on Log of Cloud Platform

  • WANG Bo, HUA Qing-yi, SHU Xin-feng

DOI
https://doi.org/10.11896/jsjkx.220200106
Journal volume & issue
Vol. 49, no. 12
pp. 125 – 135

Abstract

Reliability, usability, and security are three important indicators of software quality, and reliability is the most important of them. Traditional software reliability evaluation and prediction either treat the software system as a whole or treat its invocation structure as static. Today’s software architecture has changed significantly: typical features such as autonomy, coordination, evolution, dynamism, and adaptivity have permeated current complex networked software systems, and traditional reliability evaluation and prediction methods cannot adapt to such architectures or environments. In today’s high-speed information society, “software defines everything”. Massive information systems generate large-scale data resources, and the diversity and complexity of log resources result from the heterogeneity, parallelism, complexity, and huge scale of modern information systems. Accurate log-based analysis and anomaly prediction are therefore particularly important for building safe and reliable systems. The existing literature contains much research on anomaly prediction and software reliability, but little on real-time software reliability measurement for massive logs and complex networked component systems. Accordingly, covering the complete log-processing procedure, from log analysis, feature extraction, anomaly detection, and prediction evaluation to real-time reliability evaluation, this paper uses an ensemble learning model to analyze and predict anomalies in massive system logs. The model is compared with traditional machine learning methods to improve the accuracy, recall, and F1 score of anomaly prediction. Because the predicted recall is low, the evaluation result is used to correct the real-time reliability, which greatly improves its accuracy. Based on the reliability of the individual components, a system reliability model based on Markov theory is used to measure the reliability of composite microservice components, providing an accurate data basis and an anomaly-localization basis for intelligent operation and maintenance.
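
The abstract does not spell out the Markov formulation, so the following is only a minimal sketch of one common way to derive system reliability from individual component reliabilities: an absorbing Markov chain over component invocations, in the style of Cheung's user-oriented model. The function names, the transition matrix, the reliability values, and the convention that the first component is the entry point and the last is the exit are all assumptions for illustration, not the paper's method.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def system_reliability(transition, reliability):
    """Probability that a request entering at component 0 completes
    correctly at the last component (assumed to be the only exit).

    transition[i][j]: assumed probability that component i calls j next.
    reliability[i]:   assumed probability that component i runs correctly.
    """
    n = len(reliability)
    # x_i = R_i * sum_j P_ij * x_j for non-exit components, x_exit = R_exit,
    # written as the linear system (I - Q) x = b with Q_ij = R_i * P_ij.
    a = [[(1.0 if i == j else 0.0) - reliability[i] * transition[i][j]
          for j in range(n)] for i in range(n)]
    b = [0.0] * (n - 1) + [reliability[-1]]
    return solve_linear(a, b)[0]


# Hypothetical three-service chain: 0 -> 1 -> 2.
series = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
print(system_reliability(series, [0.99, 0.98, 0.97]))
```

For a purely sequential chain this reduces to the product of the component reliabilities; with branching, each path is weighted by its transition probabilities, which is what makes the per-component reliabilities usable for locating the component that most degrades the composite.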

Keywords