Floating-Point Embedding: Enhancing the Mathematical Comprehension of Large Language Models

Xiaoxiao Jin; Chenyang Mao; Dengfeng Yue; Tuo Leng

doi:10.3390/sym16040478

Symmetry (Apr 2024)

Affiliations

Xiaoxiao Jin: Department of Computer Engineering and Science, Shanghai University, Shanghai 200444, China
Chenyang Mao: Department of Computer Engineering and Science, Shanghai University, Shanghai 200444, China
Dengfeng Yue: Department of Computer Engineering and Science, Shanghai University, Shanghai 200444, China
Tuo Leng: Department of Computer Engineering and Science, Shanghai University, Shanghai 200444, China

DOI: https://doi.org/10.3390/sym16040478
Journal volume & issue: Vol. 16, no. 4, p. 478

Abstract

Processing and understanding numerical information in natural language is an active research topic. In applications ranging from text analysis to information retrieval, handling the numerical content of text correctly is essential to task success. Encoding and embedding techniques designed specifically for numbers can improve performance on tasks that inherently involve numerical values, such as masked prediction and numerical reasoning. Treating numbers in text merely as words is therefore inadequate; their numerical semantics must be represented explicitly. In recent years, a range of encoding methods designed specifically for numerical content has emerged, with promising results. We observe similarities between the Transformer architecture and CPU architecture, in which symmetry plays a crucial role. Based on this observation, and drawing on computer system theory, we introduce a floating-point representation and devise a corresponding embedding module. Numerical representations correspond one-to-one with their semantic vector values, so the two are symmetric with respect to the intermediate transformation. Within a predefined precision range, our method encodes and embeds numerical information more completely and gives every number a distinct encoding. In tests on multiple encoder-only models and datasets, the method produced competitive results: compared with the models' default embedding methods, it improved Top-1 accuracy by approximately 3.8% and reduced perplexity by approximately 0.43. These results support the effectiveness of the proposed method. Furthermore, the richer numerical semantics provided by a more complete embedding improve the model's capacity for semantic understanding.
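
The abstract does not state the bit layout of the floating-point representation. As a rough illustration only, the sketch below assumes the IEEE 754 single-precision layout (1 sign bit, 8 exponent bits, 23 fraction bits) and shows how a number can be mapped to a fixed-length bit vector and recovered from it, so that distinct values within that precision receive distinct encodings. The function names are hypothetical and are not taken from the paper, and the learned embedding module itself is not shown.

```python
import struct


def float32_bits(x: float) -> list[int]:
    """Return the 32 bits of x as an IEEE 754 single-precision value, MSB first.

    Assumption: single precision stands in for the paper's
    "predefined precision range"; the actual format may differ.
    """
    (packed,) = struct.unpack(">I", struct.pack(">f", x))
    return [(packed >> shift) & 1 for shift in range(31, -1, -1)]


def split_fields(bits: list[int]) -> tuple[int, list[int], list[int]]:
    """Split a 32-bit vector into (sign, exponent bits, fraction bits)."""
    return bits[0], bits[1:9], bits[9:]


def bits_to_float32(bits: list[int]) -> float:
    """Inverse of float32_bits: rebuild the number from its bit vector."""
    packed = 0
    for bit in bits:
        packed = (packed << 1) | bit
    (x,) = struct.unpack(">f", struct.pack(">I", packed))
    return x


# Example: -2.5 = -1.25 * 2**1, so the biased exponent is 128.
sign, exponent, fraction = split_fields(float32_bits(-2.5))
```

Because the mapping is invertible for every value representable in the assumed format, a bit vector of this kind could serve as the input to an embedding layer in place of a subword token, which is one way to read the one-to-one correspondence described above.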

Published in Symmetry

ISSN: 2073-8994 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science: Mathematics
Website: http://www.mdpi.com/journal/symmetry/
