Linking cohort-based data with electronic health records: a proof-of-concept methodological study in Hong Kong
Yun Kwok Wing,
Ian C K Wong,
Esther W Chan,
Shiu Lun Au Yeung,
Gabriel M Leung,
Le Gao,
Xue Li,
Patrick Ip,
Terry Y S Lum,
Celine S L Chui,
Nirmala Rao,
Miriam T Y Leung,
Rosa S M Wong,
Edward W W Chan,
Adrienne Y L Chan,
Wilfred H S Wong,
Tatia M C Lee
Affiliations
Yun Kwok Wing
Li Ka Shing Institute of Health Sciences, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong SAR, China
Ian C K Wong
Centre for Safe Medication Practice and Research, Department of Pharmacology and Pharmacy, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, Hong Kong
Esther W Chan
Department of Pharmacology and Pharmacy, The University of Hong Kong, Hong Kong, Hong Kong
Shiu Lun Au Yeung
School of Public Health, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong Special Administrative Region, People's Republic of China
Gabriel M Leung
School of Public Health, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, Hong Kong
Le Gao
Centre for Safe Medication Practice and Research, Department of Pharmacology and Pharmacy, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, Hong Kong
Xue Li
Centre for Global Health, Usher Institute, The University of Edinburgh, Edinburgh, UK
Patrick Ip
Department of Paediatrics and Adolescent Medicine, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, Hong Kong
Terry Y S Lum
Department of Social Work and Social Administration, Faculty of Social Science, The University of Hong Kong, Hong Kong, Hong Kong
Celine S L Chui
Laboratory of Data Discovery for Health (D24H), Hong Kong Science and Technology Park, Hong Kong, Hong Kong
Nirmala Rao
Faculty of Education, The University of Hong Kong, Hong Kong, Hong Kong
Miriam T Y Leung
Centre for Safe Medication Practice and Research, Department of Pharmacology and Pharmacy, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, Hong Kong
Rosa S M Wong
Department of Paediatrics and Adolescent Medicine, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, Hong Kong
Edward W W Chan
Centre for Safe Medication Practice and Research, Department of Pharmacology and Pharmacy, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, Hong Kong
Adrienne Y L Chan
Centre for Safe Medication Practice and Research, Department of Pharmacology and Pharmacy, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, Hong Kong
Wilfred H S Wong
Department of Paediatrics and Adolescent Medicine, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, Hong Kong
Tatia M C Lee
Department of Psychology, The University of Hong Kong, Hong Kong, Hong Kong
Objectives Data linkage of cohort-based data and electronic health records (EHRs) has been practised in many countries, but such research is still lacking in Hong Kong. To expand the use of multisource data, we aimed to identify a feasible way of linking two cohorts with EHRs in Hong Kong.

Methods Participants in the ‘Children of 1997’ birth cohort and the Chinese Early Development Instrument (CEDI) cohort were separated into several batches. The Hong Kong Identity Card Numbers (HKIDs) of each batch were then uploaded to the Hong Kong Clinical Data Analysis and Reporting System (CDARS) to retrieve EHRs. Because CDARS does not return HKIDs, records were matched exactly on date of birth and sex, a combination that was unique to each participant within a batch. Raw data collected for the two cohorts were checked for the mismatched cases. After the matching, we conducted a simple descriptive analysis of attention deficit hyperactivity disorder (ADHD) information collected in the CEDI cohort via the Strengths and Weaknesses of ADHD Symptoms and Normal Behaviour Scale (SWAN) and EHRs.

Results In total, 3473 and 910 HKIDs in the birth cohort and CEDI cohort were separated into 44 and 5 batches, respectively, and then submitted to the CDARS, with 100% and 97% being valid HKIDs, respectively. The match rates were confirmed to be 100% and 99.75% after checking the cohort data. In our illustration using the ADHD information in the CEDI cohort, 36 (4.47%) individuals had an ADHD–Combined score over the clinical cut-off in the SWAN survey, and 68 (8.31%) individuals had ADHD records in EHRs.

Conclusions Using date of birth and sex as identifiable variables, we were able to link the cohort data and EHRs with high match rates. This method will assist in the generation of databases for future multidisciplinary research using both cohort data and EHRs.
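
The batching and matching idea described in the Methods can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' implementation: the abstract does not say how batches were formed, so the greedy assignment below is an assumption, and the function names (`make_batches`, `link_batch`), field names (`dob`, `sex`, `dx`) and participant data are all hypothetical. The sketch does not call CDARS; it only mimics a returned extract that carries date of birth and sex but no HKID.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, str]  # (date_of_birth, sex)


def make_batches(participants: List[Tuple[str, str, str]]) -> List[Dict[Key, str]]:
    """Split participants into batches with unique (dob, sex) keys per batch.

    Each participant is a (participant_id, dob, sex) tuple. Assumed greedy
    rule: place the participant in the first batch that does not yet contain
    the same key, opening a new batch when every existing one has it.
    """
    batches: List[Dict[Key, str]] = []
    for pid, dob, sex in participants:
        key = (dob, sex)
        for batch in batches:
            if key not in batch:
                batch[key] = pid
                break
        else:  # no existing batch could take this key
            batches.append({key: pid})
    return batches


def link_batch(batch: Dict[Key, str], returned: List[dict]):
    """Exact-match returned records (which carry no ID) on (dob, sex)."""
    linked: Dict[str, List[dict]] = {}
    unmatched: List[dict] = []
    for rec in returned:
        pid = batch.get((rec["dob"], rec["sex"]))
        if pid is None:
            unmatched.append(rec)  # flag for manual checking against raw data
        else:
            linked.setdefault(pid, []).append(rec)
    return linked, unmatched


# Made-up participants: P1 and P2 share date of birth and sex,
# so they cannot be placed in the same batch.
participants = [
    ("P1", "1997-04-01", "F"),
    ("P2", "1997-04-01", "F"),
    ("P3", "1997-04-02", "M"),
]
batches = make_batches(participants)

# Made-up extract returned for the first batch, without identifiers.
returned = [
    {"dob": "1997-04-02", "sex": "M", "dx": "ADHD"},
    {"dob": "1990-01-01", "sex": "F", "dx": "asthma"},
]
linked, unmatched = link_batch(batches[0], returned)
match_rate = len(linked) / len(batches[0])
```

Under this greedy rule, the number of batches equals the largest number of participants who share one date of birth and sex. Records left in `unmatched` correspond to the mismatched cases that the Methods describe checking against the raw cohort data.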