GMS Medizinische Informatik, Biometrie und Epidemiologie (Jul 2019)
Indicators of data quality: review and requirements from the perspective of networked medical research
Abstract
Data quality is of the highest importance for quantitative medical research. A common set of data quality indicators is needed to cope with future challenges in data management for biomedical informatics. A guideline for adaptive data management, developed in 2006, offers indicators for data quality organized in three categories: integrity, organization, and trueness. The guideline was revised in 2014 in a bottom-up approach by extending its content with standards from a cancer registry, a cohort, and a data repository in Germany. In parallel, a systematic literature review identified indicators of data quality published since 2005, using Medline as the literature database. In its second version, the guideline distinguishes 51 indicators (integrity: 30, organization: 15, trueness: 6). The literature review identified 34 indicators in the articles reviewed. Comparing both sets revealed a lack of indicators in the literature addressing the organizational aspects of data sets. Conversely, indicators useful for data sets used in health care practice, such as timeliness, were missing from the guideline’s set. The comparison is a first step towards a common set of indicators. Beyond an agreed-upon naming of the indicators, this set should offer an operational definition for each indicator that supports reliable application by different parties to different data sets. Furthermore, a systematic organization of the indicators would foster an appropriate selection of individual indicators according to specific use cases.
Keywords