International Journal of Population Data Science (Sep 2018)
Data Linkage Methods in Manitoba
Abstract
Introduction
At the Manitoba Centre for Health Policy (MCHP), we have been performing data linkage for over 25 years. Over time, the Manitoba Population Research Data Repository (MPRDR) has expanded to over 80 datasets. Data linkage methods are key to bringing all these data together for population-based research.

Objectives and Approach
The presentation will describe in detail the individual steps involved in the data linkage process and share the methods developed and knowledge gained over time at MCHP. We will present different scenarios linking health, education, social and justice data, along with the choices that are made prior to and during data linkage. The data linkage process and linkage methods, including data validation techniques, will be illustrated with examples from our work.

Results
The presentation will describe the different types of data held in the MPRDR and illustrate how the data are processed in a de-identified manner so that privacy and confidentiality are maintained. It will provide details on the data linkage methods used, which depend on the type of data sources being linked. This involves identifying and describing a 5-step data linkage process:
• pre-processing (gaining knowledge about the data and applying cleaning/standardization techniques);
• searching for and selecting the appropriate linkage variables;
• applying different linkage techniques (e.g., deterministic, probabilistic, “fuzzy matching” and manual review) to the data;
• applying “rules” for deciding when data linkage should occur; and
• reporting and interpreting linkage outcome metrics and quality.

Conclusion/Implications
Our ability to link different data sources provides the capacity to study questions and complex issues related to health, social, education and justice from a population perspective. The techniques and methods described in this presentation should be applicable to other organizations linking administrative data.
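
As a rough illustration of the 5-step process listed under Results, the sketch below standardizes two records, tries a deterministic (exact) match, falls back to a simple "fuzzy" comparison, and reports a linkage rate. It is not MCHP's implementation. The field names (`surname`, `given_name`, `birth_date`), the thresholds (0.85 and 0.6) and the use of Python's `difflib.SequenceMatcher` as the similarity measure are all assumptions chosen for illustration. Probabilistic linkage is not shown.

```python
from difflib import SequenceMatcher

# Hypothetical linkage variables (step 2); real projects choose these per data source.
LINKAGE_KEYS = ("surname", "given_name", "birth_date")


def standardize(record):
    """Step 1: trim, uppercase, and collapse internal whitespace in every field."""
    return {k: " ".join(str(v).upper().split()) for k, v in record.items()}


def similarity(a, b):
    """String similarity in [0, 1] from the standard library (an assumed measure)."""
    return SequenceMatcher(None, a, b).ratio()


def link(rec_a, rec_b, accept=0.85, review=0.6):
    """Steps 3 and 4: apply linkage techniques and illustrative decision rules."""
    a, b = standardize(rec_a), standardize(rec_b)
    # Deterministic: every linkage variable agrees exactly.
    if all(a[k] == b[k] for k in LINKAGE_KEYS):
        return "deterministic"
    # Fuzzy: birth date must agree exactly; names may differ slightly.
    if a["birth_date"] == b["birth_date"]:
        score = min(
            similarity(a["surname"], b["surname"]),
            similarity(a["given_name"], b["given_name"]),
        )
        if score >= accept:
            return "fuzzy"
        if score >= review:
            return "manual_review"
    return "no_link"


def linkage_rate(outcomes):
    """Step 5: share of record pairs linked without manual review."""
    linked = sum(1 for o in outcomes if o in ("deterministic", "fuzzy"))
    return linked / len(outcomes) if outcomes else 0.0


# Made-up example pairs, for illustration only.
base = {"surname": "Johnson", "given_name": "Mary", "birth_date": "1980-01-02"}
pairs = [
    (base, {"surname": " johnson ", "given_name": "MARY", "birth_date": "1980-01-02"}),
    (base, {"surname": "Johnston", "given_name": "Mary", "birth_date": "1980-01-02"}),
    ({**base, "surname": "Smith"}, {**base, "surname": "Smyth"}),
    (base, {**base, "birth_date": "1975-06-30"}),
]
outcomes = [link(a, b) for a, b in pairs]
print(outcomes, linkage_rate(outcomes))
```

In this sketch, pairs falling between the two thresholds are routed to manual review rather than linked automatically, which mirrors the idea of explicit "rules" for deciding when linkage should occur.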