Epidemiology and Health (Nov 2021)
Gender differences in under-reporting hiring discrimination in Korea: a machine learning approach
Abstract
OBJECTIVES This study was conducted to examine gender differences in under-reporting hiring discrimination by building a prediction model for workers who responded “not applicable (NA)” to a question about hiring discrimination despite being eligible to answer.
METHODS Using data from 3,576 wage workers in the seventh wave (2004) of the Korea Labor and Income Panel Study, we trained and tested 9 machine learning algorithms using “yes” or “no” responses regarding the lifetime experience of hiring discrimination. We then applied the best-performing model to estimate the prevalence of experiencing hiring discrimination among those who answered “NA.” Under-reporting of hiring discrimination was calculated by comparing the prevalence of hiring discrimination between the “yes” or “no” group and the “NA” group.
RESULTS Based on the predictions from the random forest model, we found that 58.8% of the “NA” group were predicted to have experienced hiring discrimination, while 19.7% of the “yes” or “no” group reported hiring discrimination. Among the “NA” group, the predicted prevalence of hiring discrimination for men and women was 45.3% and 84.8%, respectively.
CONCLUSIONS This study introduces a methodological strategy for epidemiologic studies to address the under-reporting of discrimination by applying machine learning algorithms.
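The prevalence comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names (`prevalence`, `under_reporting_gap`) and the toy label lists are hypothetical, and the model predictions are supplied as a ready-made list rather than produced by a trained classifier. Only the two percentages in the last step (58.8% and 19.7%) come from the abstract.

```python
from typing import Sequence


def prevalence(labels: Sequence[int]) -> float:
    """Return the share of positive labels (1 = discrimination) as a percentage."""
    if not labels:
        raise ValueError("labels must be non-empty")
    return 100.0 * sum(labels) / len(labels)


def under_reporting_gap(reported: Sequence[int], predicted_na: Sequence[int]) -> float:
    """Predicted prevalence in the NA group minus reported prevalence
    in the yes/no group, in percentage points."""
    return prevalence(predicted_na) - prevalence(reported)


# Hypothetical toy data, for illustration only.
reported = [1, 0, 0, 0, 0]       # self-reported yes/no answers (1 = yes)
predicted_na = [1, 1, 1, 0, 0]   # assumed model predictions for NA respondents

toy_gap = under_reporting_gap(reported, predicted_na)

# The same comparison using the percentages quoted in the abstract.
abstract_gap = round(58.8 - 19.7, 1)
```

Under these assumptions, the gap is the number of percentage points by which the predicted prevalence among “NA” respondents exceeds the prevalence reported by the “yes” or “no” group. The same subtraction could be repeated within each gender to compare men and women.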
Keywords