Journal of Preventive Medicine and Public Health (Sep 2017)

Level of Agreement and Factors Associated With Discrepancies Between Nationwide Medical History Questionnaires and Hospital Claims Data

  • Yeon-Yong Kim,
  • Jong Heon Park,
  • Hee-Jin Kang,
  • Eun Joo Lee,
  • Seongjun Ha,
  • Soon-Ae Shin

DOI
https://doi.org/10.3961/jpmph.17.024
Journal volume & issue
Vol. 50, no. 5
pp. 294 – 302

Abstract

Read online

Objectives The objectives of this study were to investigate the agreement between medical history questionnaire data and claims data and to identify the factors that were associated with discrepancies between these data types. Methods Data from self-reported questionnaires that assessed an individual’s history of hypertension, diabetes mellitus, dyslipidemia, stroke, heart disease, and pulmonary tuberculosis were collected from a general health screening database for 2014. Data for these diseases were collected from a healthcare utilization claims database between 2009 and 2014. Overall agreement, sensitivity, specificity, and kappa values were calculated. Multiple logistic regression analysis was performed to identify factors associated with discrepancies and was adjusted for age, gender, insurance type, insurance contribution, residential area, and comorbidities. Results Agreement was highest between questionnaire data and claims data based on primary codes up to 1 year before the completion of self-reported questionnaires and was lowest for claims data based on primary and secondary codes up to 5 years before the completion of self-reported questionnaires. When comparing data based on primary codes up to 1 year before the completion of self-reported questionnaires, the overall agreement, sensitivity, specificity, and kappa values ranged from 93.2 to 98.8%, 26.2 to 84.3%, 95.7 to 99.6%, and 0.09 to 0.78, respectively. Agreement was excellent for hypertension and diabetes, fair to good for stroke and heart disease, and poor for pulmonary tuberculosis and dyslipidemia. Women, younger individuals, and employed individuals were most likely to under-report disease. Conclusions Detailed patient characteristics that had an impact on information bias were identified through the differing levels of agreement.

Keywords