Journal of Universal Computer Science (Nov 2021)
Leveraging multifaceted proximity measures among developers in predicting future collaborations to improve the social capital of software projects
Abstract
Social capital is an asset that people earn through their social connections. One motivation for developers to contribute to open source development and maintenance tasks is to earn social capital. Recent studies suggest that a project's social capital affects the sustained participation of developers in open source software (OSS). One way to improve a project's social capital is to help developers connect with their peers. However, to the best of our knowledge, no prior research attempts to predict future collaborations among developers and to establish the significance of these collaborations for improving social capital at the project level. To address this research gap, we model past collaborations among developers on the version control system (VCS) and the issue tracking system (ITS) as homogeneous and heterogeneous developer social networks (DSNs). Novel path-count-based features, defined on the proposed heterogeneous DSN, are combined with multifaceted proximity features to form the feature set for machine learning classifiers. Our experiments on five popular open source projects (Spark, Kafka, Flink, WildFly, Hibernate) indicate that the proposed approach can predict future collaborations among developers on both platforms, i.e. VCS and ITS, with significant accuracy (AUROC up to 0.85 for VCS and 0.9 for ITS). We propose a generic metric, recall of gain in social capital, to investigate how effectively these predicted collaborations improve the social capital of the project. We also concretised this metric for various measures of social capital and found that the collaborations predicted by our approach have significant potential to improve social capital at the project level (e.g. recall of gain in cohesion index up to 0.98 and recall of gain in average godfather index up to 0.99 for VCS).
We also showed that the structure of the collaboration network has an impact on the accuracy and usefulness of the predicted collaborations. Since past research suggests that many newcomers abandon open source projects because of social barriers they face after joining, our research outcomes can be used to build recommendation systems that might help retain such developers by improving their social ties based on similar skills or interests.
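As a rough illustration of what pairwise features on a heterogeneous developer network might look like, the sketch below counts developer-artifact-developer paths and computes a Jaccard proximity score for each developer pair. The abstract does not give the paper's actual feature definitions, so the toy data, the function names (`path_count`, `jaccard`, `pair_features`) and the choice of Jaccard similarity are all assumptions made for this example.

```python
from itertools import combinations

# Hypothetical toy data (not from the paper): which developer touched which artifact.
commits = {  # developer -> files changed in the VCS
    "alice": {"core.py", "io.py"},
    "bob": {"core.py", "net.py"},
    "carol": {"io.py"},
}
comments = {  # developer -> issues commented on in the ITS
    "alice": {"ISSUE-1"},
    "bob": {"ISSUE-1", "ISSUE-2"},
    "carol": {"ISSUE-2"},
}


def path_count(artifacts, a, b):
    """Count length-2 paths developer -> artifact -> developer between a and b."""
    return len(artifacts.get(a, set()) & artifacts.get(b, set()))


def jaccard(artifacts, a, b):
    """Jaccard similarity of the two developers' artifact sets (assumed proximity measure)."""
    left, right = artifacts.get(a, set()), artifacts.get(b, set())
    union = left | right
    return len(left & right) / len(union) if union else 0.0


def pair_features(a, b):
    """Assemble one feature row for the developer pair (a, b)."""
    return {
        "paths_via_file": path_count(commits, a, b),
        "paths_via_issue": path_count(comments, a, b),
        "jaccard_files": jaccard(commits, a, b),
        "jaccard_issues": jaccard(comments, a, b),
    }


# One feature row per unordered developer pair.
features = {
    pair: pair_features(*pair) for pair in combinations(sorted(commits), 2)
}
```

Rows like these, labelled by whether the pair collaborated in a later time window, could then be passed to any standard binary classifier.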
Keywords