Knowledge (Feb 2024)

Web Mining of Online Resources for German Labor Market Research and Education: Finding the Ground Truth?

  • Andreas Fischer,
  • Jens Dörpinghaus

DOI
https://doi.org/10.3390/knowledge4010003
Journal volume & issue
Vol. 4, no. 1
pp. 51 – 67

Abstract

Read online

The labor market is highly dependent on vocational and academic education, training, retraining, and further education in order to master challenges such as advancing digitalization and sustainability. Further training is a key factor in ensuring a qualified workforce, the employability of all employees, and, thus, national competitiveness and innovation. In the contribution at hand, we explore an innovative way to derive knowledge about learning pathways by connecting the dots from different data sources of the German labor market. In particular, we focus on the web mining of online resources for German labor market research and education, such as online advertisements, information portals, and official government websites. A key question for working with different data sources is how to find the ground truth and common data structures that can be used to make the data interoperable. We discuss how to classify and summarize web data from different platforms and which methods can be used for extracting data, entities and relationships from online resources on the German labor market to build a network of educational pathways. Our proposed solution is based on the classification of occupations (KldB) and related document codes (DKZ), and combines natural language processing and knowledge graph technologies. Our research provides the foundation for further investigation into educational pathways and linked data for labor market research. While our work focuses on German data, it is also useful for other German-speaking countries and could easily be extended to other languages such as English.

Keywords