International Journal of Population Data Science (Mar 2020)

Is there an agreement between self-reported medical diagnosis in the CARTaGENE cohort and the Québec administrative health databases?

  • Yves Payette,
  • Cristiano Soares de Moura,
  • Catherine Boileau,
  • Sasha Bernatsky,
  • Nolwenn Noisel

DOI
https://doi.org/10.23889/ijpds.v5i1.1155
Journal volume & issue
Vol. 5, no. 1

Abstract

Background: Population health studies often use existing databases that were not necessarily built for research purposes. The question arises as to whether different data sources, such as administrative health data (AHD) and self-report questionnaires, are equivalent and yield similar information.

Objectives: The main objective of this study was to assess the level of agreement between self-reported medical conditions and medical diagnoses captured in AHD. A secondary objective was to identify predictors of agreement between the two data sources across medical conditions. The study thus explores the extent to which these two commonly used methods of public health data collection provide concordant records, and identifies the main predictors of variation in that concordance.

Methods: Data were extracted from CARTaGENE, a population-based cohort study in Québec, Canada, and linked to the provincial health insurance records of the same individuals, namely the MED-ÉCHO database from the Régie de l’assurance maladie du Québec (RAMQ) and physician fee-for-service billing records, for the period 1998-2012. Agreement statistics (kappa coefficient), along with sensitivity, specificity and positive predictive value, were calculated for 19 chronic conditions and 12 types of cancer. Logistic regressions were used to identify predictors of concordance between self-report and AHD from significant covariates (sex, age group, education, region, income, heavy utilization of the health care system and Charlson comorbidity index).

Results: Agreement between self-reported data and AHD ranged across diseases from a kappa of 0.09 for chronic renal failure to 0.86 for type 2 diabetes. Sensitivity of self-reported data was higher than 50% for 14 of the 31 medical conditions studied, and was highest for myocardial infarction (88.62%), breast cancer (86.28%) and diabetes (85.06%). Specificity was generally high, with a minimum value of 89.70%. Lower concordance between data sources was observed for participants with a higher frequency of health care utilization and higher comorbidity scores.

Discussion: Overall, there was moderate agreement between the two data sources, but important variations were found depending on the type of disease. This suggests that CARTaGENE’s participants were generally able to correctly identify the diseases they suffer from, with some exceptions. These results may help researchers choose adequate data sources according to specific study objectives. They also suggest that Québec’s AHD seem to underestimate the prevalence of some chronic conditions, which might result in inaccurate estimates of morbidity, with consequences for public health surveillance.
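
The agreement statistics named in the Methods (kappa, sensitivity, specificity and positive predictive value) can all be derived from a 2x2 cross-tabulation of self-report against AHD for one condition. The following is a minimal sketch, not the authors' code: the function name, argument names and the example counts are hypothetical, and it assumes that AHD serves as the reference standard when computing the sensitivity, specificity and predictive value of self-report, and that no row or column of the table is empty.

```python
def agreement_stats(both, self_only, ahd_only, neither):
    """Agreement between self-report and AHD for a single condition.

    Arguments are the four cell counts of a 2x2 table (hypothetical names):
      both      -- condition reported by the participant and recorded in AHD
      self_only -- reported by the participant, absent from AHD
      ahd_only  -- recorded in AHD, not reported by the participant
      neither   -- absent from both sources
    Assumption: AHD is treated as the reference standard.
    """
    n = both + self_only + ahd_only + neither

    # Observed proportion of participants on whom the two sources agree.
    observed = (both + neither) / n

    # Agreement expected by chance, from the marginal prevalences.
    p_self = (both + self_only) / n
    p_ahd = (both + ahd_only) / n
    expected = p_self * p_ahd + (1 - p_self) * (1 - p_ahd)

    return {
        # Cohen's kappa: agreement beyond chance, scaled to its maximum.
        "kappa": (observed - expected) / (1 - expected),
        # Share of AHD-recorded cases that participants also reported.
        "sensitivity": both / (both + ahd_only),
        # Share of AHD-negative participants who reported no condition.
        "specificity": neither / (neither + self_only),
        # Share of self-reported cases that are confirmed in AHD.
        "ppv": both / (both + self_only),
    }


# Illustrative counts only; these are not data from the study.
stats = agreement_stats(both=40, self_only=10, ahd_only=10, neither=940)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Because kappa corrects for chance agreement, a rare condition can show high specificity and still have a low kappa, which is consistent with the wide range of kappa values reported alongside uniformly high specificity.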

Keywords