IEEE Access (Jan 2024)

Ladder Curriculum Learning for Domain Generalization in Cross-Domain Classification

  • Xiaoshun Wang
  • Sibei Luo
  • Yiming Gao

DOI
https://doi.org/10.1109/ACCESS.2024.3425602
Journal volume & issue
Vol. 12
pp. 95356–95367

Abstract

Domain generalization seeks to learn a domain-invariant representation from multiple source domains so that a model generalizes robustly to previously unseen target domains. Most existing domain generalization methods for cross-domain classification train models on examples drawn at random from all source domains. This can destabilize training through conflicting gradients and thereby degrade the model’s generalization ability. Recently, curriculum learning has been applied successfully to domain generalization. However, we find that existing methods focus only on domain shift and ignore intra-domain category shift, so gradient conflicts persist and continue to limit generalization. To address these challenges, we propose a novel and general method, ladder curriculum learning (LCL). Specifically, we deliver the source-domain data in stages, ordered from easy to difficult. We sort not only the data across domains from easy to difficult (inter-domain curriculum learning) but also the data within each domain from easy to difficult (intra-domain curriculum learning). Through the combined effect of the two, LCL effectively addresses the optimization problem of conflicting gradient directions. Experiments on widely used public datasets show that LCL significantly improves baseline methods, by margins of up to 1.5%. We also find that LCL can be applied to existing domain generalization methods, further enhancing the network’s generalization capability with an average improvement of 1%.
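As a rough illustration of the staging the abstract describes, the Python sketch below orders domains by mean difficulty (inter-domain curriculum), orders samples within each domain by difficulty (intra-domain curriculum), and releases the data in cumulative stages. The difficulty scorer `difficulty_fn`, the cumulative-fraction schedule, and all names here are our assumptions for illustration, not the paper’s specification.

```python
import numpy as np

def ladder_curriculum_stages(domains, difficulty_fn, n_stages=3):
    """Stage source-domain data from easy to difficult (hypothetical sketch).

    domains      : list of per-domain lists of training examples
    difficulty_fn: maps an example to a scalar difficulty score
                   (e.g., loss under a pretrained reference model -- an
                   assumed choice, not specified by the paper)
    n_stages     : number of curriculum stages ("ladder rungs")
    """
    # Intra-domain curriculum: sort each domain's samples easy -> hard.
    by_sample = [sorted(d, key=difficulty_fn) for d in domains]

    # Inter-domain curriculum: sort domains by mean difficulty easy -> hard.
    by_sample.sort(key=lambda d: np.mean([difficulty_fn(x) for x in d]))

    # Ladder delivery: stage k exposes the easiest k/n_stages fraction of
    # every domain, so easy domains and easy samples enter training first.
    stages = []
    for k in range(1, n_stages + 1):
        stage = []
        for d in by_sample:
            cutoff = int(np.ceil(len(d) * k / n_stages))
            stage.extend(d[:cutoff])
        stages.append(stage)
    return stages


# Toy usage: three "domains" of scalars, with |x| as the difficulty score.
doms = [[0.1, 0.9, 0.5], [2.0, 1.5, 2.5], [1.0, 0.2, 0.8]]
for i, s in enumerate(ladder_curriculum_stages(doms, abs, n_stages=3), 1):
    print(f"stage {i}: {s}")
```

Note that this sketch uses cumulative stages, so later rungs retain all earlier (easier) data; the paper’s actual schedule and difficulty measure may differ.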

Keywords