Natural Language Processing Journal (Dec 2024)
Job description parsing with explainable transformer based ensemble models to extract the technical and non-technical skills
Abstract
The rapid digitization of the economy is transforming the job market, creating new roles and reshaping existing ones. As skill requirements evolve, identifying essential competencies becomes increasingly critical. This paper introduces a novel ensemble model that combines traditional and transformer-based neural networks to extract both technical and non-technical skills from job descriptions. A substantial dataset of job descriptions from reputable platforms was annotated for 22 IT roles. The proposed model achieved F-scores of 67% for non-technical skills and 72% for technical skills, outperforming both a conventional CRF model and hybrid deep learning models. Specifically, it outperformed these two baselines by average margins of 10% and 6%, respectively, for non-technical skills, and 29% and 6.8%, respectively, for technical skills. A 5 × 2cv paired t-test confirmed the statistical significance of these improvements. In addition, Local Interpretable Model-Agnostic Explanations (LIME) were employed in the experiments to enhance model interpretability.
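As an illustration of the 5 × 2cv paired t-test mentioned above, the following is a minimal sketch of the statistic in its commonly used formulation: five replications of 2-fold cross-validation, with the first score difference divided by the square root of the mean per-replication variance, and the result compared against a Student's t distribution with 5 degrees of freedom. The function name `five_by_two_cv_t` and the score differences in `example_diffs` are hypothetical and are not taken from the paper; the abstract does not state how the authors implemented the test.

```python
from math import sqrt

# Two-sided 5% critical value of Student's t with 5 degrees of freedom.
T_CRIT_5DF = 2.571


def five_by_two_cv_t(diffs):
    """Compute the 5x2cv paired t statistic.

    diffs: five (p1, p2) pairs, where p1 and p2 are the score differences
    (model A minus model B) on the two folds of one 2-fold CV replication.
    """
    if len(diffs) != 5 or any(len(pair) != 2 for pair in diffs):
        raise ValueError("expected five (fold1, fold2) difference pairs")

    variances = []
    for p1, p2 in diffs:
        mean = (p1 + p2) / 2
        # Per-replication variance estimate of the difference.
        variances.append((p1 - mean) ** 2 + (p2 - mean) ** 2)

    denom = sqrt(sum(variances) / 5)
    if denom == 0:
        raise ValueError("all replications have zero variance")

    # The numerator is the difference from the first fold of the first replication.
    return diffs[0][0] / denom


# Made-up F-score differences, for illustration only (not the paper's results).
example_diffs = [(0.06, 0.04), (0.05, 0.07), (0.08, 0.06), (0.05, 0.03), (0.07, 0.05)]
t_stat = five_by_two_cv_t(example_diffs)
significant = abs(t_stat) > T_CRIT_5DF
print(f"t = {t_stat:.3f}, significant at 5%: {significant}")
```

In this formulation, only the first difference appears in the numerator, while all ten differences contribute to the variance estimate in the denominator.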