PLoS ONE (Jan 2017)

Identifying maternal and infant factors associated with newborn size in rural Bangladesh by partial least squares (PLS) regression analysis.

  • Alamgir Kabir,
  • Md Jahanur Rahman,
  • Abu Ahmed Shamim,
  • Rolf D W Klemm,
  • Alain B Labrique,
  • Mahbubur Rashid,
  • Parul Christian,
  • Keith P West

DOI
https://doi.org/10.1371/journal.pone.0189677
Journal volume & issue
Vol. 12, no. 12
p. e0189677

Abstract

Read online

Birth weight, length and circumferences of the head, chest and arm are key measures of newborn size and health in developing countries. We assessed maternal socio-demographic factors associated with multiple measures of newborn size in a large rural population in Bangladesh using partial least squares (PLS) regression method. PLS regression, combining features from principal component analysis and multiple linear regression, is a multivariate technique with an ability to handle multicollinearity while simultaneously handling multiple dependent variables. We analyzed maternal and infant data from singletons (n = 14,506) born during a double-masked, cluster-randomized, placebo-controlled maternal vitamin A or β-carotene supplementation trial in rural northwest Bangladesh. PLS regression results identified numerous maternal factors (parity, age, early pregnancy MUAC, living standard index, years of education, number of antenatal care visits, preterm delivery and infant sex) significantly (p<0.001) associated with newborn size. Among them, preterm delivery had the largest negative influence on newborn size (Standardized β = -0.29 - -0.19; p<0.001). Scatter plots of the scores of first two PLS components also revealed an interaction between newborn sex and preterm delivery on birth size. PLS regression was found to be more parsimonious than both ordinary least squares regression and principal component regression. It also provided more stable estimates than the ordinary least squares regression and provided the effect measure of the covariates with greater accuracy as it accounts for the correlation among the covariates and outcomes. Therefore, PLS regression is recommended when either there are multiple outcome measurements in the same study, or the covariates are correlated, or both situations exist in a dataset.