Applied Sciences (Jun 2021)
LTmatch: A Method to Abstract Pattern from Unstructured Log
Abstract
Logs record valuable data from different software and systems. Execution logs are widely available and are helpful for monitoring, examining, and understanding complex applications. However, log files usually contain too many lines for a human to handle, so it is important to develop methods for processing logs by computer. Logs are usually unstructured, which hinders automatic analysis. Categorizing logs and turning them into structured data automatically is therefore of great practical significance. In this paper, we propose the LTmatch algorithm, a log pattern extraction algorithm based on a weighted word matching rate. Compared with our previous work, this algorithm not only classifies logs according to the longest common subsequence (LCS) but also obtains and updates the log templates in real time. In addition, the pattern warehouse of the algorithm uses a fixed-depth tree to store the log patterns, which improves the matching efficiency of log pattern extraction. To verify the advantages of the algorithm, we applied it to an open-source data set containing different kinds of labeled log data and compared it with a variety of state-of-the-art log pattern extraction algorithms. The results show that our method improves average accuracy by 2.67% over the best result among the other methods.
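The LCS-based matching and template updating mentioned above can be illustrated with a minimal Python sketch. It is not the LTmatch implementation: the matching rate here is unweighted (the weighting scheme is not given in this abstract), the fixed-depth tree is omitted, and the names `WILDCARD`, `lcs_tokens`, `match_rate`, and `merge_template` are hypothetical.

```python
from typing import List

WILDCARD = "<*>"  # hypothetical placeholder for variable tokens


def lcs_tokens(a: List[str], b: List[str]) -> List[str]:
    """Return one longest common subsequence of two token lists."""
    m, n = len(a), len(b)
    # dp[i][j] holds the LCS length of a[i:] and b[j:]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    # Walk the table from the top-left corner to recover the tokens.
    out: List[str] = []
    i = j = 0
    while i < m and j < n:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return out


def match_rate(log_tokens: List[str], template_tokens: List[str]) -> float:
    """Unweighted share of tokens covered by the LCS (an assumption;
    the paper uses a weighted word matching rate)."""
    constant = [t for t in template_tokens if t != WILDCARD]
    common = lcs_tokens(log_tokens, constant)
    longest = max(len(log_tokens), len(template_tokens))
    return len(common) / longest if longest else 0.0


def merge_template(log_tokens: List[str], template_tokens: List[str]) -> List[str]:
    """Keep the LCS tokens and collapse every other run into one wildcard."""
    constant = [t for t in template_tokens if t != WILDCARD]
    common = lcs_tokens(log_tokens, constant)
    merged: List[str] = []
    k = 0  # index of the next LCS token still to be matched
    for tok in log_tokens:
        if k < len(common) and tok == common[k]:
            merged.append(tok)
            k += 1
        elif not merged or merged[-1] != WILDCARD:
            merged.append(WILDCARD)
    return merged
```

For example, matching `"Connection from 10.0.0.1 closed".split()` against the template `"Connection from 10.0.0.2 closed".split()` gives a rate of 0.75 and the merged template `['Connection', 'from', '<*>', 'closed']`. A real-time extractor would compare each incoming log with the stored templates, merge it into the best match if the rate exceeds a chosen threshold, and otherwise store it as a new template.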
Keywords