Pragmatic and Observational Research (Mar 2024)
Using Claims Data to Predict Pre-Operative BMI Among Bariatric Surgery Patients: Development of the BMI Before Bariatric Surgery Scoring System (B3S3)
Abstract
Jenna Wong,1–3 Xiaojuan Li,1,2 David E Arterburn,4 Dongdong Li,1,2 Elizabeth Messenger-Jones,1 Rui Wang,1,2 Sengwee Toh1,2

1Department of Population Medicine, Harvard Pilgrim Health Care Institute, Boston, MA, USA; 2Department of Population Medicine, Harvard Medical School, Boston, MA, USA; 3Optum Labs Visiting Fellow, Eden Prairie, MN, USA; 4Kaiser Permanente Washington Health Research Institute, Seattle, WA, USA

Correspondence: Jenna Wong, Department of Population Medicine, Harvard Medical School & Harvard Pilgrim Health Care Institute, 401 Park Drive, Suite 401 East, Boston, MA, 02215, USA, Tel +1 617 867-4513, Fax +1 617 867-427, Email [email protected]

Lack of body mass index (BMI) measurements limits the utility of claims data for bariatric surgery research, but pre-operative BMI may be imputed due to existence of weight-related diagnosis codes and BMI-related reimbursement requirements. We used a machine learning pipeline to create a claims-based scoring system to predict pre-operative BMI, as documented in the electronic health record (EHR), among patients undergoing a new bariatric surgery.

Methods: Using the Optum Labs Data Warehouse, containing linked de-identified claims and EHR data for commercial or Medicare Advantage enrollees, we identified adults undergoing a new bariatric surgery between January 2011 and June 2018 with a BMI measurement in linked EHR data ≤ 30 days before the index surgery (n=3226). We constructed predictors from claims data and applied a machine learning pipeline to create a scoring system for pre-operative BMI, the B3S3. We evaluated the B3S3 and a simple linear regression model (benchmark) in test patients whose index surgery occurred concurrent (2011–2017) or prospective (2018) to the training data.

Results: The machine learning pipeline yielded a final scoring system that included weight-related diagnosis codes, age, and number of days hospitalized and distinct drugs dispensed in the past 6 months. In concurrent test data, the B3S3 had excellent performance (R² 0.862, 95% confidence interval [CI] 0.815–0.898) and calibration. The benchmark algorithm had good performance (R² 0.750, 95% CI 0.686–0.799) and calibration, but both aspects were inferior to the B3S3. Findings in prospective test data were similar.

Conclusion: The B3S3 is an accessible tool that researchers can use with claims data to obtain granular and accurate predicted values of pre-operative BMI, which may enhance confounding control and investigation of effect modification by baseline obesity levels in bariatric surgery studies utilizing claims data.

Plain Language Summary: Pre-operative BMI is an important potential confounder in comparative effectiveness studies of bariatric surgeries. Claims data lack clinical measurements, but insurance reimbursement requirements for bariatric surgery often result in pre-operative BMI being coded in claims data. We used a machine learning pipeline to create a model, the B3S3, to predict pre-operative BMI, as documented in the EHR, among bariatric surgery patients based on the presence of certain weight-related diagnosis codes and other patient characteristics derived from claims data. Researchers can easily use the B3S3 with claims data to obtain granular and accurate predicted values of pre-operative BMI among bariatric surgery patients.

Keywords: bariatric surgery, body mass index, confounding variable, comparative effectiveness research, administrative claims, supervised machine learning
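
The Results report performance as R² with a 95% CI. As a rough illustration of how such a summary could be computed for any set of predicted and EHR-documented BMI values, the sketch below calculates R² and a percentile bootstrap interval. It is not the authors' code: the abstract does not state how the intervals were derived, so the percentile bootstrap, the definition of R² as 1 − SS_res/SS_tot, the function names, and the synthetic BMI values are all assumptions made for illustration.

```python
import random


def r_squared(observed, predicted):
    """Coefficient of determination, defined here as 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def bootstrap_ci(observed, predicted, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for R^2, resampling patients with replacement.

    Assumption: the abstract does not say which CI method was used.
    """
    rng = random.Random(seed)
    n = len(observed)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(
            r_squared([observed[i] for i in idx], [predicted[i] for i in idx])
        )
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic stand-ins for EHR-documented and claims-predicted BMI (hypothetical).
rng = random.Random(42)
observed_bmi = [rng.gauss(45, 7) for _ in range(300)]
predicted_bmi = [bmi + rng.gauss(0, 3) for bmi in observed_bmi]

r2 = r_squared(observed_bmi, predicted_bmi)
ci_low, ci_high = bootstrap_ci(observed_bmi, predicted_bmi)
print(f"R^2 = {r2:.3f}, 95% CI {ci_low:.3f} to {ci_high:.3f}")
```

Resampling whole patients keeps each observed value paired with its prediction, which is why the indices are drawn once per replicate and applied to both lists.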