IEEE Access (Jan 2022)
PeerRank: Robust Learning to Rank With Peer Loss Over Noisy Labels
Abstract
User-generated data are extensively utilized in learning to rank because they are easy to collect and up-to-date. However, such data inevitably contain noisy labels caused by users’ annotation mistakes, lack of domain knowledge, system failures, etc., which makes building a robust model challenging. Because deep neural networks readily fit noisy datasets, noisy labels significantly degrade the performance of learning-to-rank algorithms. To cope with this problem, previous studies have proposed several label de-noising methods. However, these methods are either sensitive to the noise distribution of the dataset, dependent on clean data, or computationally expensive. Moreover, most of them are difficult to extend to other scenarios. This paper proposes a simple yet effective framework named PeerRank that can be applied to a broad range of learning-to-rank applications, such as click-through rate prediction and commercial web search. PeerRank is a robust, effective, and adaptable framework that can be coupled with numerous models and comes with theoretical guarantees. Extensive experiments on three public real-world datasets with thirteen point-wise base models, and on four semi-synthetic datasets with four pair-wise base models, show that PeerRank yields consistent improvements. Comparisons of PeerRank with seven classic and state-of-the-art de-noising methods validate the advantages of the PeerRank framework for learning to rank over noisy labels.
Keywords