Quantitative Science Studies (Feb 2020)

A supervised machine learning approach to trace doctorate recipients’ employment trajectories

  • Heinisch, Dominik P.,
  • Koenig, Johannes,
  • Otto, Anne

DOI
https://doi.org/10.1162/qss_a_00001
Journal volume & issue
Vol. 1, no. 1
pp. 94 – 116

Abstract

Little information is available on doctorate recipients’ career outcomes (BuWiN, 2013). With the current information base, graduate students cannot make an informed decision on whether to start a doctorate (Benderly, 2018; Blank et al., 2017). Administrative labor market data, which could in principle provide the necessary information, are incomplete in this respect. In this paper, we describe the record linkage of two data sets to close this information gap: data on doctorate recipients collected in the catalog of the German National Library (DNB), and the German labor market biographies (IEB) from the German Institute of Employment Research. We use a machine learning-based methodology that (a) improves the record linkage of data sets without unique identifiers, and (b) evaluates the quality of the record linkage. The machine learning algorithms are trained and evaluated on a synthetic data set. In an exemplary analysis, we compare the evolution of the employment status of female and male doctorate recipients in Germany.
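
To make the idea of supervised record linkage without unique identifiers more concrete, the following is a minimal sketch and not the authors' implementation. It assumes each record carries a hypothetical `name` and `year` field, builds a comparison vector per candidate pair, and fits a small logistic regression on a handful of hand-made labeled pairs that stand in for a synthetic training set. The abstract does not state which features or classifiers the paper uses; the field names, features, toy data, and choice of logistic regression are all illustrative assumptions.

```python
import math
from difflib import SequenceMatcher


def features(a, b):
    """Comparison vector for one candidate pair (field names are hypothetical)."""
    name_sim = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    year_close = 1.0 if abs(a["year"] - b["year"]) <= 1 else 0.0
    return [name_sim, year_close]


def train_logistic(X, y, lr=0.5, epochs=3000):
    """Fit a logistic regression by plain stochastic gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def match_probability(model, a, b):
    """Estimated probability that records a and b refer to the same person."""
    w, bias = model
    z = bias + sum(wj * xj for wj, xj in zip(w, features(a, b)))
    return 1.0 / (1.0 + math.exp(-z))


# Toy labeled pairs (invented for illustration): 1 = same person, 0 = different.
labeled_pairs = [
    ({"name": "Anna Schmidt", "year": 1980}, {"name": "Anna Schmitt", "year": 1980}, 1),
    ({"name": "Jonas Weber", "year": 1982}, {"name": "Jonas Weber", "year": 1982}, 1),
    ({"name": "Maria Hoffmann", "year": 1975}, {"name": "Maria Hofmann", "year": 1975}, 1),
    ({"name": "Anna Schmidt", "year": 1980}, {"name": "Peter Klein", "year": 1980}, 0),
    ({"name": "Jonas Weber", "year": 1982}, {"name": "Lena Fischer", "year": 1982}, 0),
    ({"name": "Maria Hoffmann", "year": 1975}, {"name": "Maria Hoffmann", "year": 1990}, 0),
]

X = [features(a, b) for a, b, _ in labeled_pairs]
y = [label for _, _, label in labeled_pairs]
model = train_logistic(X, y)

# Score a new candidate pair; a threshold such as 0.5 turns the score into a link decision.
score = match_probability(
    model,
    {"name": "Jonas Weber", "year": 1982},
    {"name": "Jonas Weber", "year": 1982},
)
```

Because the classifier is trained on pairs with known labels, the same setup also allows the linkage quality to be estimated, for example by computing precision and recall on labeled pairs held out from training.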