Efficient Sentence Representation Learning via Knowledge Distillation with Maximum Coding Rate Reduction

Domagoj Ševerdija; Tomislav Prusina; Luka Borozan; Domagoj Matijević

doi:10.20532/cit.2023.1005673

Journal of Computing and Information Technology (Jan 2023)

Affiliations

Domagoj Ševerdija: School of Applied Mathematics and Computer Science, University of Osijek, Croatia
Tomislav Prusina: Universität Hamburg, Department of Informatics, Germany
Luka Borozan: School of Applied Mathematics and Computer Science, University of Osijek, Croatia
Domagoj Matijević: School of Applied Mathematics and Computer Science, University of Osijek, Croatia

DOI: https://doi.org/10.20532/cit.2023.1005673
Journal volume & issue: Vol. 31, no. 4
pp. 251 – 266

Abstract

Addressing the demand for effective sentence representation in natural language inference problems, this paper explores the utility of pre-trained large language models in computing such representations. Although these models generate high-dimensional sentence embeddings, a noticeable performance disparity arises when they are compared to smaller models. The hardware limitations concerning space and time necessitate the use of smaller, distilled versions of large language models. In this study, we investigate the knowledge distillation of Sentence-BERT, a sentence representation model, by introducing an additional projection layer trained on the novel Maximum Coding Rate Reduction (MCR²) objective designed for general-purpose manifold clustering. Our experiments demonstrate that the distilled language model, with reduced complexity and sentence embedding size, can achieve comparable results on semantic retrieval benchmarks, providing a promising solution for practical applications.

Published in Journal of Computing and Information Technology

ISSN: 1846-3908 (Online)
Publisher: University of Zagreb Faculty of Electrical Engineering and Computing
Country of publisher: Croatia
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: http://cit.fer.hr/index.php/CIT/index
