Multivariable Mendelian randomization with incomplete measurements on the exposure variables in the Hispanic Community Health Study/Study of Latinos
Yilun Li,
Kin Yau Wong,
Annie Green Howard,
Penny Gordon-Larsen,
Heather M. Highland,
Mariaelisa Graff,
Kari E. North,
Carolina G. Downie,
Christy L. Avery,
Bing Yu,
Kristin L. Young,
Victoria L. Buchanan,
Robert Kaplan,
Lifang Hou,
Brian Thomas Joyce,
Qibin Qi,
Tamar Sofer,
Jee-Young Moon,
Dan-Yu Lin
Affiliations
Yilun Li
Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
Kin Yau Wong
Department of Applied Mathematics, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong
Annie Green Howard
Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA; Carolina Population Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
Penny Gordon-Larsen
Carolina Population Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA; Department of Nutrition, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
Heather M. Highland
Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
Mariaelisa Graff
Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
Kari E. North
Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
Carolina G. Downie
Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
Christy L. Avery
Carolina Population Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA; Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
Bing Yu
Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, University of Texas Health Science Center at Houston, Houston, TX 77030, USA
Kristin L. Young
Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
Victoria L. Buchanan
Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
Robert Kaplan
Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY 10461, USA; Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA
Lifang Hou
Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA
Brian Thomas Joyce
Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA
Qibin Qi
Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY 10461, USA
Tamar Sofer
Department of Medicine, Harvard Medical School, Boston, MA 02115, USA
Jee-Young Moon
Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY 10461, USA
Dan-Yu Lin
Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA; Corresponding author
Summary: Multivariable Mendelian randomization allows simultaneous estimation of direct causal effects of multiple exposure variables on an outcome. When the exposure variables of interest are quantitative omic features, obtaining complete data can be economically and technically challenging: the measurement cost is high, and the measurement devices may have inherent detection limits. In this paper, we propose a valid and efficient method to handle unmeasured and undetectable values of the exposure variables in a one-sample multivariable Mendelian randomization analysis with individual-level data. We estimate the direct causal effects with maximum likelihood estimation and develop an expectation-maximization algorithm to compute the estimators. We show the advantages of the proposed method through simulation studies and provide an application to the Hispanic Community Health Study/Study of Latinos, which has a large amount of unmeasured exposure data.
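As a rough illustration of the setting described above, the following sketch writes down one linear structural model under which such an analysis could be carried out. The notation ($G_i$, $X_i$, $Y_i$, $\alpha_0$, $A$, $\beta_0$, $\beta$, $\Sigma$, $c_k$) and the joint normality assumption are introduced here for illustration only and are not taken from the summary; the model actually used in the paper may differ.

```latex
% Illustrative notation (assumed, not from the summary):
%   G_i : vector of genetic instruments for subject i
%   X_i : K-vector of exposure variables (possibly incompletely measured)
%   Y_i : outcome
\begin{align*}
  X_i &= \alpha_0 + A^{\top} G_i + \epsilon_i, \\
  Y_i &= \beta_0 + \beta^{\top} X_i + \eta_i, \\
  (\epsilon_i^{\top}, \eta_i)^{\top} \mid G_i &\sim N(0, \Sigma).
\end{align*}
% beta holds the direct causal effects of the K exposures on the outcome.
% Correlation between epsilon_i and eta_i (off-diagonal blocks of Sigma)
% stands in for unmeasured confounding.

% Each component X_{ik} is either observed, unmeasured, or known only to
% lie below a detection limit c_k. Writing theta for all parameters, the
% observed-data likelihood integrates the joint density over the
% unobserved components:
\begin{equation*}
  L(\theta) = \prod_{i=1}^{n} \int_{\mathcal{R}_i}
    f\bigl(Y_i, X_i^{\mathrm{obs}}, x^{\mathrm{mis}} \mid G_i; \theta\bigr)
    \, dx^{\mathrm{mis}},
\end{equation*}
% where R_i is the product of (-infty, infty) over unmeasured components
% and (-infty, c_k) over undetectable components of X_i.
```

Under these assumptions, an expectation-maximization algorithm would alternate between computing the conditional expectations of the complete-data sufficient statistics given the observed data (E-step) and updating $\theta$ by maximizing the resulting expected complete-data log-likelihood (M-step).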