Heliyon (Aug 2024)
A comprehensive study of coefficient signs in weighted logistic regression
Abstract
In this paper, we study the signs of the coefficients in weighted logistic regression, a variant of logistic regression in which each observation carries a positive weight and which is commonly used for imbalanced data sets and for reject inference in credit scoring. We first examine simple weighted logistic regression. Under full-rank and overlap assumptions, we show that the sign of the slope matches the sign of the difference between the weighted averages of the independent variable in the two groups, 1 and 0. We then extend the analysis to multiple weighted logistic regression using two vectors: the vector of slopes and the vector of differences in weighted averages of the independent variables between the two groups. We establish that if one vector is zero, the other must also be zero, and we prove that if the slope vector is nonzero, the angle between the two vectors is acute. These theoretical results can serve as a preliminary step before feature selection, which is important in logistic regression. Our numerical analysis illustrates how the results apply to reject inference on the well-known German Credit Data, and we provide a detailed explanation of feature selection in that analysis.
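The two sign results stated in the abstract can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes synthetic Gaussian data, arbitrary positive weights, and a plain Newton-Raphson fit with an intercept. The helper names (`fit_weighted_logit`, `weighted_mean_diff`, `solve`) and all data-generating parameters are hypothetical choices made for illustration only.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def fit_weighted_logit(X, y, w, iters=50, tol=1e-10):
    """Newton-Raphson fit of weighted logistic regression.

    Returns [intercept, slope_1, ..., slope_p]. Assumes full rank and
    overlap (no separation), so the maximizer exists.
    """
    rows = [[1.0] + list(x) for x in X]
    p = len(rows[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for r, yi, wi in zip(rows, y, w):
            mu = sigmoid(sum(b * v for b, v in zip(beta, r)))
            for j in range(p):
                grad[j] += wi * (yi - mu) * r[j]
                for k in range(p):
                    hess[j][k] += wi * mu * (1.0 - mu) * r[j] * r[k]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta


def weighted_mean_diff(X, y, w):
    """Per feature: weighted mean in group 1 minus weighted mean in group 0."""
    diff = []
    for j in range(len(X[0])):
        mean = {}
        for g in (0, 1):
            total = sum(wi for yi, wi in zip(y, w) if yi == g)
            mean[g] = sum(wi * x[j] for x, yi, wi in zip(X, y, w) if yi == g) / total
        diff.append(mean[1] - mean[0])
    return diff


# Synthetic data (illustrative assumption, not the German Credit Data).
random.seed(0)
n = 300
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(n)]
y = [1 if random.random() < sigmoid(0.3 + 0.8 * a - 0.5 * b) else 0 for a, b in X]
w = [random.uniform(0.5, 2.0) for _ in range(n)]

# Simple case: one independent variable.
X1 = [[a] for a, _ in X]
slope_simple = fit_weighted_logit(X1, y, w)[1]
diff_simple = weighted_mean_diff(X1, y, w)[0]

# Multiple case: slope vector versus vector of weighted-mean differences.
slopes = fit_weighted_logit(X, y, w)[1:]
diffs = weighted_mean_diff(X, y, w)
dot = sum(s * d for s, d in zip(slopes, diffs))
cos_angle = dot / (math.hypot(*slopes) * math.hypot(*diffs))

print("simple: slope and difference share a sign:", slope_simple * diff_simple > 0)
print("multiple: angle is acute (cosine > 0):", cos_angle > 0)
```

In the simple case the product of the fitted slope and the weighted-mean difference should be positive, and in the multiple case the cosine of the angle between the two vectors should be positive. Individual slopes in the multiple case need not share a sign with their own weighted-mean differences; the result concerns the angle between the two vectors as a whole.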