GMS Medizinische Informatik, Biometrie und Epidemiologie (Jun 2006)

Personal identifiers in medical research networks: evaluation of the personal identifier generator in the Competence Network Paediatric Oncology and Haematology

  • Pommerening, Klaus,
  • Herold, Ralf,
  • Glock, Jutta

Journal volume & issue
Vol. 2, no. 2
p. Doc06

Abstract

The Society for Paediatric Oncology and Haematology (GPOH) and the corresponding Competence Network Paediatric Oncology and Haematology conduct various clinical trials. Comprehensive analysis of these trials requires reliable identification of the recruited patients. Therefore, a personal identifier (PID) generator is used to assign unambiguous, pseudonymous, non-reversible PIDs to participants in those trials. We tested the matching algorithm of the PID generator using a configuration specific to the GPOH. False data was used to verify the correct processing of PID requests (functionality tests), while test data was used to evaluate the matching outcome. We also assigned PIDs to more than 44,000 data records from the German Childhood Cancer Registry (GCCR) and assessed the status of the associated patient list, which contains the PIDs, partly encrypted data items and information on the PID generation process for each data record. All the functionality tests showed the expected results. Neither the 14,915 test data records nor the GCCR data records yielded any homonyms. Six synonyms were found in the test data, due to erroneous birth dates, and 22 synonyms were found when the GCCR data was run against the actual patient list of 2,579 records. In the resulting patient list of 45,693 entries, duplicate record submissions were found for about 7% of all listed patients, while more frequent submissions occurred in less than 1% of cases. The synonym error rate depends mainly on the quality of the input data and on the frequency of multiple submissions. Depending on the maximum tolerable synonym and homonym error rates, additional measures for securing input data quality might be necessary. The results demonstrate that the PID generator is an appropriate tool for reliably identifying trial participants in medical research networks.
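
The two error types counted in the abstract can be illustrated with a minimal sketch. It assumes the usual record-linkage reading of the terms: a synonym error is one patient who receives more than one PID, and a homonym error is one PID shared by more than one patient. The function name, the record layout and the sample data are hypothetical and are not taken from the PID generator itself; the sketch only shows how such errors could be counted once the true patient identity is known for a test set.

```python
from collections import defaultdict


def count_linkage_errors(assignments):
    """Count synonym and homonym errors in a labelled test set.

    `assignments` is an iterable of (true_patient, pid) pairs, one per
    submitted record. This layout is an assumption for illustration.
    Returns (patients_with_synonyms, pids_with_homonyms).
    """
    pids_by_patient = defaultdict(set)
    patients_by_pid = defaultdict(set)
    for patient, pid in assignments:
        pids_by_patient[patient].add(pid)
        patients_by_pid[pid].add(patient)

    # Synonym error: the same patient was assigned several different PIDs.
    synonyms = sum(1 for pids in pids_by_patient.values() if len(pids) > 1)
    # Homonym error: several different patients share a single PID.
    homonyms = sum(1 for pats in patients_by_pid.values() if len(pats) > 1)
    return synonyms, homonyms


# Hypothetical sample data, not drawn from the study.
records = [
    ("patient_a", "PID1"),
    ("patient_a", "PID1"),  # duplicate submission, matched correctly
    ("patient_b", "PID2"),
    ("patient_b", "PID3"),  # synonym: e.g. a mistyped birth date
    ("patient_c", "PID4"),
    ("patient_d", "PID4"),  # homonym: two patients merged into one PID
]

synonyms, homonyms = count_linkage_errors(records)
print(f"synonyms: {synonyms}, homonyms: {homonyms}")
```

In this sketch a correctly matched duplicate submission does not count as an error, which mirrors the distinction the abstract draws between multiple submissions and synonyms.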

Keywords