Frontiers in Digital Health (Oct 2020)
A Computable Phenotype Model for Classification of Men Who Have Sex With Men Within a Large Linked Database of Laboratory, Surveillance, and Administrative Healthcare Records
Abstract
Background: Most public health datasets do not include sexual orientation measures, which limits the data available to monitor health disparities and evaluate tailored interventions. We therefore developed, validated, and applied a novel computable phenotype model to classify men who have sex with men (MSM) using multiple health datasets from British Columbia, Canada, 1990–2015.
Methods: Three case surveillance databases, a public health laboratory database, and five administrative health databases were linked and deidentified (BC Hepatitis Testers Cohort), resulting in a retrospective cohort of 727,091 adult men. Known MSM status from the three disease case surveillance databases was used to develop a multivariable model for classifying MSM in the full cohort. Models were selected using elastic-net regularization (GLMNet package) in R, and the final model was chosen to optimize the area under the receiver operating characteristic curve. We compared characteristics of known MSM, classified MSM, and classified heterosexual men.
Findings: History of gonorrhea and syphilis diagnoses, HIV tests in the past year, history of a visit to an identified gay and bisexual men's clinic, and residence in MSM-dense neighborhoods were all positively associated with being MSM. The selected model had a sensitivity of 72% and a specificity of 94%. Excluding those with known MSM status, a total of 85,521 men (12% of cohort) were classified as MSM.
Interpretation: Computable phenotyping is a promising approach for classification of sexual minorities and investigation of health outcomes in the absence of routinely available self-report data.
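As an illustration of the performance measures reported in the abstract (sensitivity, specificity, and area under the receiver operating characteristic curve), the following minimal Python sketch computes them for a binary classifier. It is not the authors' code, which used elastic-net models in R. The labels, scores, and the 0.5 threshold are invented toy values, and the function names are hypothetical.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels coded 1/0."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumed for illustration only): 1 = known positive, 0 = known negative.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.05]

threshold = 0.5  # hypothetical classification cut-off
y_pred = [1 if s >= threshold else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} auc={auc:.3f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas the area under the curve summarizes discrimination across all thresholds, which is one reason it is commonly used as a model selection criterion.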
Keywords