International Journal of Population Data Science (Nov 2019)

Achieving quality primary care data: a description of the Canadian Primary Care Sentinel Surveillance Network data capture, extraction, and processing in Alberta

  • Stephanie Garies,
  • Michael Cummings,
  • Brian Forst,
  • Kerry McBrien,
  • Boglarka Soos,
  • Matt Taylor,
  • Neil Drummond,
  • Donna Manca,
  • Kimberley Duerksen,
  • Hude Quan,
  • Tyler Williamson

DOI
https://doi.org/10.23889/ijpds.v4i2.1132
Journal volume & issue
Vol. 4, no. 2

Abstract

Electronic medical record (EMR) databases have become increasingly popular for secondary purposes, such as health research. The Canadian Primary Care Sentinel Surveillance Network (CPCSSN) is the country’s first and only national primary care EMR data repository, with de-identified health information for almost two million Canadians. Comprehensive and freely available documentation describing the data ‘lifecycle’ is important for assessing potential data quality issues and for interpreting research findings appropriately. Here, we describe the flow of CPCSSN data in the province of Alberta. The data originate from 54 publicly funded primary care settings, including one community pediatric clinic, with 318 providers contributing de-identified EMR data for 410,951 patients. Data extraction methods have been developed for five different EMR systems and include both backend and automated frontend extractions. The raw EMR data are transformed according to specific rules, including trimming implausible values, converting values and free text to standard terminologies or classification systems, and structuring the data into a common CPCSSN format. Regional networks across Canada are responsible for their local data extraction and processing before the data are transferred to a central repository and made available for research and disease surveillance. This paper aims to provide important contextual information to future CPCSSN data users.
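
The three transformation steps named in the abstract (trimming implausible values, converting free text to a classification system, and structuring records into a common format) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the CPCSSN rules: the blood pressure bounds, the two-entry lookup table (ICD-9 is used only as an example of a classification system), the raw field names `id`, `sbp` and `dx_text`, and the output field names.

```python
from typing import Optional

# Hypothetical plausibility bounds for systolic blood pressure (mmHg).
# Illustrative only; not the thresholds CPCSSN applies.
PLAUSIBLE_SYSTOLIC_MMHG = (40.0, 300.0)

# Hypothetical free-text lookup to ICD-9 categories; a real mapping
# would be far larger and handle spelling variants.
FREE_TEXT_TO_ICD9 = {
    "diabetes": "250",
    "hypertension": "401",
}


def trim_implausible(value: float, bounds: tuple) -> Optional[float]:
    """Return the value if it lies within bounds, otherwise None."""
    low, high = bounds
    return value if low <= value <= high else None


def code_free_text(text: str) -> Optional[str]:
    """Map a free-text diagnosis to a code, or None if unrecognised."""
    return FREE_TEXT_TO_ICD9.get(text.strip().lower())


def to_common_format(raw: dict) -> dict:
    """Restructure one raw EMR record into a single shared layout."""
    return {
        "patient_id": raw["id"],
        "systolic_mmhg": trim_implausible(float(raw["sbp"]), PLAUSIBLE_SYSTOLIC_MMHG),
        "diagnosis_code": code_free_text(raw["dx_text"]),
    }
```

In this sketch an out-of-range measurement or an unrecognised diagnosis string becomes `None` rather than being dropped, so the record itself is retained and the gap stays visible to downstream users.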