Mathematical Biosciences and Engineering (Mar 2022)
A retrieval and ranking method of mathematical documents based on CA-YOLOv5 and HFS
Abstract
In a retrieval system for mathematical documents based on mathematical expressions, the input and matching of mathematical expressions are key steps that affect the system's usability, accessibility and efficiency, because such expressions have special attributes. This paper therefore focuses on improving the input efficiency and matching accuracy of mathematical expressions. It proposes a method for retrieving and ranking mathematical documents based on CA-YOLOv5 and HFS (hesitant fuzzy set), which exploits the strengths of the CA (coordinate attention) model and YOLOv5 in target detection and of HFS in multi-attribute decision-making. By embedding the CA model into the YOLOv5 network, the method extracts and recognizes the mathematical expressions in layout images to form mathematical query expressions. These query expressions are then analyzed to obtain similarity evaluation features and are matched against candidate mathematical expressions, indexed by the same features, in a library of mathematical documents, with HFS serving as the similarity evaluation measure. Experiments were performed on the TFD-ICDAR2019v2 dataset and the NTCIR dataset. The F1-score of mathematical expression detection was 76.54%, the MAP (mean average precision) of mathematical document retrieval was 71.73%, and the average nDCG of mathematical document ranking was 80.89%.
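For readers unfamiliar with the three reported metrics, the following is a minimal Python sketch of their textbook definitions. It is not the paper's evaluation code. The function names are hypothetical, average precision is assumed to be normalized by the number of relevant items found in the ranked list, and nDCG is assumed to use linear gains with a log2 discount. The paper's exact variants may differ.

```python
import math


def f1_score(tp, fp, fn):
    """F1 from detection counts: true positives, false positives, false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def average_precision(relevance):
    """AP for one query; `relevance` holds 0/1 flags in ranked order.

    Assumption: normalized by the number of relevant items in the list.
    """
    hits = 0
    total = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / hits if hits else 0.0


def mean_average_precision(relevance_lists):
    """MAP: mean of the per-query average precision values."""
    if not relevance_lists:
        return 0.0
    return sum(average_precision(r) for r in relevance_lists) / len(relevance_lists)


def ndcg(gains):
    """nDCG for one ranked list of graded relevance values (linear gain)."""
    def dcg(values):
        return sum(g / math.log2(i + 1) for i, g in enumerate(values, start=1))

    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal else 0.0
```

MAP averages the per-query precision over all queries, while nDCG compares the discounted gain of the returned ordering with that of the ideal ordering, so a ranking that is already sorted by relevance scores 1.0.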
Keywords