Pulmonary Circulation (Jan 2024)
Clinical evaluation of code‐based algorithms to identify patients with pulmonary arterial hypertension in healthcare databases
Abstract
Pulmonary arterial hypertension (PAH) is a rare subgroup of pulmonary hypertension (PH). Claims and administrative databases can be particularly important for research in rare diseases; however, there is a lack of validated algorithms to identify PAH patients using administrative codes. We aimed to measure the accuracy of code‐based PAH algorithms against the true clinical diagnosis by right heart catheterization (RHC). This study evaluated algorithms in patients who were recorded in two linkable data assets: the Stanford Healthcare administrative electronic health record database and the Stanford Vera Moulton Wall Center clinical PH database (which records each patient's RHC diagnosis). We assessed the sensitivity and specificity achieved by 16 algorithms (six published). In total, 720 PH patients with linked data available were included and 558 (78%) of these were PAH patients. Algorithms consisting solely of a P(A)H‐specific diagnostic code classed all or almost all PH patients as PAH (sensitivity >97%, specificity <12%) while multicomponent algorithms with well‐defined temporal sequences of procedure, diagnosis and treatment codes achieved a better balance of sensitivity and specificity. Specificity increased and sensitivity decreased with increasing algorithm complexity. The best‐performing algorithms, in terms of fewest misclassified patients, included multiple components (e.g., PH diagnosis, PAH treatment, continuous enrollment for ≥6 months before and ≥12 months following index date) and achieved sensitivities and specificities of around 95% and 38%, respectively. Our findings help researchers tailor their choice and design of code‐based PAH algorithms to their research question and demonstrate the importance of including well‐defined temporal components in the algorithms.
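The accuracy measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the `ConfusionCounts` class and the `evaluate` function are hypothetical names introduced here, and the "flag everyone" algorithm is an invented limiting case rather than one of the 16 algorithms studied. Only the cohort sizes (558 PAH patients among 720 PH patients, hence 162 non‐PAH patients) come from the abstract. The sketch assumes each patient has one boolean algorithm flag and one boolean RHC reference diagnosis, and that the cohort contains at least one patient of each class.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of algorithm flags cross-tabulated against the reference diagnosis."""
    true_pos: int   # flagged as PAH, PAH by reference
    false_neg: int  # not flagged, PAH by reference
    true_neg: int   # not flagged, non-PAH by reference
    false_pos: int  # flagged as PAH, non-PAH by reference

    @property
    def sensitivity(self) -> float:
        # Share of reference-positive patients that the algorithm flags.
        return self.true_pos / (self.true_pos + self.false_neg)

    @property
    def specificity(self) -> float:
        # Share of reference-negative patients that the algorithm leaves unflagged.
        return self.true_neg / (self.true_neg + self.false_pos)

    @property
    def misclassified(self) -> int:
        # Total number of patients placed in the wrong class.
        return self.false_neg + self.false_pos


def evaluate(flagged: Iterable[bool], reference_pah: Iterable[bool]) -> ConfusionCounts:
    """Tally one algorithm's flags against the reference diagnoses (hypothetical helper)."""
    tp = fn = tn = fp = 0
    for is_flagged, is_pah in zip(flagged, reference_pah):
        if is_pah and is_flagged:
            tp += 1
        elif is_pah:
            fn += 1
        elif is_flagged:
            fp += 1
        else:
            tn += 1
    return ConfusionCounts(true_pos=tp, false_neg=fn, true_neg=tn, false_pos=fp)


# Cohort sizes from the abstract: 558 PAH and 720 - 558 = 162 non-PAH patients.
reference = [True] * 558 + [False] * 162

# Invented limiting case: an algorithm that flags every PH patient as PAH.
flag_everyone = [True] * 720
baseline = evaluate(flag_everyone, reference)

print(baseline.sensitivity, baseline.specificity, baseline.misclassified)
```

Because 78% of the cohort has PAH, this limiting case misclassifies only the 162 non‐PAH patients while having no specificity at all, which is why a count of misclassified patients is best read together with sensitivity and specificity rather than alone.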
Keywords