Journal of King Saud University: Computer and Information Sciences (May 2025)

Learning the fine-grained code representation for log-level prediction

  • Zhiyong Zhao,
  • Guodong Fan,
  • Jing Li,
  • Ming Zhu,
  • Haotian Zhang,
  • Hongli Su

DOI
https://doi.org/10.1007/s44443-025-00064-9
Journal volume & issue
Vol. 37, no. 4
pp. 1 – 17

Abstract

Log levels are crucial for distinguishing the severity of logs, and they directly reflect the urgency of transactions in software systems. Determining log levels automatically and efficiently is an important and challenging task in log management. Current automatic log-level prediction approaches that use abstract syntax tree (AST)-based representation graphs do not consider fine-grained semantics, e.g., the effects of subtle syntactic differences among similar programs and the semantics of different edges, which leads to poor accuracy in log-level prediction. To address these issues, we perform data augmentation by applying code transformations that change the shape of the abstract syntax tree without changing the semantics of the code. In addition, we integrate data flow and call relationships into a code representation graph and define eight types of edges in the graph. We then design a multi-relational graph neural network that learns the impact of each edge type on the log-level prediction task and learns the corresponding edge weights according to their types. To verify the effectiveness of the proposed approach, we conduct experiments on widely used open-source systems. Experimental results show that the proposed approach has clear advantages over state-of-the-art methods in predicting log levels.
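The idea of weighting edges by their type can be illustrated with a minimal sketch. It is not the authors' model: the abstract does not name the eight edge types, so the labels `type_0` to `type_7` are placeholders, and the function `relational_layer`, the scalar per-type weights, and the mean aggregation are all assumptions made for illustration. In a trained multi-relational graph neural network the per-type parameters would be learned (typically as weight matrices), whereas here they are fixed by hand.

```python
from collections import defaultdict

# Placeholder labels: the paper defines eight edge types, but the abstract
# does not list them, so these names are hypothetical.
EDGE_TYPES = [f"type_{i}" for i in range(8)]


def relational_layer(features, edges, type_weights):
    """One round of message passing with a separate scalar weight per edge type.

    features:     dict mapping node -> list of floats (all the same length)
    edges:        list of (source, target, edge_type) triples
    type_weights: dict mapping edge_type -> float (fixed here, learned in practice)
    """
    dim = len(next(iter(features.values())))
    summed = {node: [0.0] * dim for node in features}
    incoming = defaultdict(int)

    # Each message is the source feature vector scaled by its edge type's weight.
    for source, target, edge_type in edges:
        weight = type_weights[edge_type]
        for i, value in enumerate(features[source]):
            summed[target][i] += weight * value
        incoming[target] += 1

    # Mean-aggregate the messages, add the node's own features, apply ReLU.
    updated = {}
    for node, own in features.items():
        count = incoming[node] or 1
        updated[node] = [max(0.0, own[i] + summed[node][i] / count) for i in range(dim)]
    return updated


# Toy graph: two edges of different (placeholder) types point at node "b".
features = {"a": [1.0, 2.0], "b": [0.0, 1.0], "c": [3.0, -1.0]}
edges = [("a", "b", "type_0"), ("c", "b", "type_1")]
type_weights = {edge_type: 0.0 for edge_type in EDGE_TYPES}
type_weights["type_0"] = 1.0
type_weights["type_1"] = 0.5

out = relational_layer(features, edges, type_weights)
print(out["b"])
```

In this toy graph, the message from `c` is scaled by half as much as the message from `a` purely because of its edge type, which is the kind of type-dependent influence the abstract describes the network as learning.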

Keywords