BMJ Open (Jul 2019)
Implementation of an algorithm for the identification of breast cancer deaths in German health insurance claims data: a validation study based on a record linkage with administrative mortality data
Abstract
Objective To adapt a Canadian algorithm for identifying breast cancer (BC) deaths among women to German health insurance claims data, and to test and validate the algorithm by comparing its results with official cause of death (CoD) data at the individual and the population level.
Design Validation study, secondary data, medical claims.
Setting Claims data of two statutory health insurance providers (SHIs) for inpatient and outpatient care; CoD added via record linkage with the epidemiological cancer registry (ECR).
Participants All women who were insured with the two SHIs, died in the period 2006–2013, were residents of North Rhine-Westphalia (NRW) and were linked with ECR data: n=22 413.
Main outcome measures Based on inpatient and outpatient diagnoses in the year before death, six algorithms were derived, and the agreement of the algorithm-based CoD with the official CoD was evaluated by calculating specificity, sensitivity, and negative and positive predictive values (NPV, PPV). Furthermore, algorithm-based age-specific BC mortality rates covering several calendar years were calculated for the entire insured female population and compared with official national rates.
Results Our final algorithm, derived from the NRW subsample, comprised codes indicating the presence of BC, metastases and a terminal illness phase, together with the absence of codes for other tumours. Overall, specificity, sensitivity, NPV and PPV of this algorithm were 97.4%, 91.3%, 98.9% and 81.7%, respectively. In the age range 40–80 years, sensitivity and PPV decreased slightly with increasing age. Algorithm-based age-specific BC mortality rates agreed well with official rates except for the age group 85 years and older.
Conclusions The algorithm-based identification of BC deaths in German claims data is feasible and valid, except at higher ages.
The algorithm seems applicable for ascertaining BC mortality rates in epidemiological studies when information on the official CoD is not available in the original database.
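The validation logic described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the four boolean flags, the `Decedent` class and the function names are hypothetical, and combining the criteria as a simple conjunction is an assumption, since the abstract does not spell out how the codes are combined. The toy cohort is invented purely to exercise the code and does not reproduce any study data.

```python
from dataclasses import dataclass


@dataclass
class Decedent:
    # Hypothetical flags derived from claims in the year before death.
    has_bc_code: bool              # diagnosis code indicating breast cancer
    has_metastasis_code: bool      # code indicating metastases
    has_terminal_phase_code: bool  # code indicating a terminal illness phase
    has_other_tumour_code: bool    # code for a tumour other than BC
    official_cod_is_bc: bool       # official cause of death (reference standard)


def algorithm_says_bc_death(d: Decedent) -> bool:
    # Assumption: all three positive criteria must hold and no other
    # tumour may be coded. The published rule may differ in detail.
    return (
        d.has_bc_code
        and d.has_metastasis_code
        and d.has_terminal_phase_code
        and not d.has_other_tumour_code
    )


def ratio(numerator: int, denominator: int) -> float:
    # Return NaN instead of raising when a margin of the 2x2 table is empty.
    return numerator / denominator if denominator else float("nan")


def validation_metrics(decedents: list[Decedent]) -> dict[str, float]:
    # Build the 2x2 table of algorithm-based CoD versus official CoD.
    tp = fp = fn = tn = 0
    for d in decedents:
        predicted = algorithm_says_bc_death(d)
        if predicted and d.official_cod_is_bc:
            tp += 1
        elif predicted and not d.official_cod_is_bc:
            fp += 1
        elif not predicted and d.official_cod_is_bc:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": ratio(tp, tp + fn),  # BC deaths correctly identified
        "specificity": ratio(tn, tn + fp),  # non-BC deaths correctly excluded
        "ppv": ratio(tp, tp + fp),          # algorithm-positive that are true BC deaths
        "npv": ratio(tn, tn + fn),          # algorithm-negative that are true non-BC deaths
    }


# Invented toy cohort: 3 true positives, 1 false positive,
# 1 false negative and 5 true negatives.
toy_cohort = (
    [Decedent(True, True, True, False, True)] * 3
    + [Decedent(True, True, True, False, False)]
    + [Decedent(True, False, False, False, True)]
    + [Decedent(False, False, False, False, False)] * 5
)

if __name__ == "__main__":
    for name, value in validation_metrics(toy_cohort).items():
        print(f"{name}: {value:.3f}")
```

Computing all four measures from one 2x2 table keeps them mutually consistent. Unlike sensitivity and specificity, PPV and NPV depend on the share of BC deaths in the cohort, so they would not carry over directly to a population with a different BC mortality.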