IEEE Access (Jan 2024)
Contrastive Meta-Learner for Automatic Text Labeling and Semantic Textual Similarity
Abstract
Generating large labeled datasets is a common barrier in machine learning efforts; both labeling the data and building useful models from it present frequent challenges. We introduce a new approach to the automatic text labeling and semantic textual similarity tasks that uses an encoder layer fine-tuned with triplet loss. This approach, contrastive meta-learning (CML), is designed to produce a naturally separable embedding space from minimal a priori examples. We find that CML matches state-of-the-art performance among comparable few-shot automatic labeling methods. On the semantic textual similarity task, CML closely approximates a model trained on the full dataset with as few as 8 training examples, whereas other common approaches require external datasets.
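As a rough illustration of the triplet-loss fine-tuning step described above (a minimal sketch, not the paper's implementation), the objective can be expressed in PyTorch as follows. The ToyEncoder, its dimensions, the margin, and the optimizer settings are all illustrative assumptions; the paper's encoder is a fine-tuned text encoder rather than this stand-in.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the paper's text encoder: any module that
# maps token ids to a fixed-size embedding illustrates the objective.
class ToyEncoder(nn.Module):
    def __init__(self, vocab_size=30522, dim=128):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, dim)

    def forward(self, token_ids):
        return self.embed(token_ids)

encoder = ToyEncoder()
triplet_loss = nn.TripletMarginLoss(margin=1.0, p=2)  # margin is assumed
optimizer = torch.optim.AdamW(encoder.parameters(), lr=2e-5)

# Each batch holds (anchor, positive, negative) texts: the positive
# shares the anchor's label, the negative does not. Random token ids
# stand in for tokenized text here.
anchor = torch.randint(0, 30522, (8, 16))    # 8 examples, 16 tokens each
positive = torch.randint(0, 30522, (8, 16))
negative = torch.randint(0, 30522, (8, 16))

# Pull anchor-positive pairs together and push anchor-negative pairs
# apart, shaping a separable embedding space from few labeled examples.
loss = triplet_loss(encoder(anchor), encoder(positive), encoder(negative))
loss.backward()
optimizer.step()
```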
Keywords