Ensemble learning for impurity prediction in high-purity indium purified via vertical zone refining

Zhongwen Shang; Meizhen Wu; Jubo Peng; Hongxing Zheng

Intelligent Systems with Applications (Jun 2024)

Affiliations

Zhongwen Shang: Shanghai Engineering Research Center for Integrated Circuits and Advanced Display Materials, Shanghai University, Shanghai 200444, China; School of Materials Science and Engineering, Shanghai University, Shanghai 200444, China
Meizhen Wu: Research & Development Center, Yunnan Tin Group (Holding) Limited Company, Kunming 650032, Yunnan, China
Jubo Peng: Research & Development Center, Yunnan Tin Group (Holding) Limited Company, Kunming 650032, Yunnan, China
Hongxing Zheng: Shanghai Engineering Research Center for Integrated Circuits and Advanced Display Materials, Shanghai University, Shanghai 200444, China; School of Materials Science and Engineering, Shanghai University, Shanghai 200444, China; Corresponding author.

Journal volume & issue: Vol. 22, p. 200390

Abstract

The complexity of raw materials and multi-step purification processes makes it technically challenging to establish universally applicable process parameters for producing high-purity metals. Machine learning has become an important tool in materials science: it enables accurate prediction of target variables and accelerates process optimization, thereby reducing both experimental cost and time. This study explores the use of high-precision machine learning models to predict the residual impurity content of high-purity indium after vertical zone refining. A collection of 82 experimental datasets was used to determine the optimal hyperparameters for XGBoost and LightGBM models through Bayesian optimization. Under leave-one-out cross-validation (LOOCV), the XGBoost and LightGBM models achieved mean absolute errors (MAEs) of 0.022 and 0.023, respectively. Because this performance was comparable to that of the previously established Ridge regression model (MAE = 0.024), fusion techniques, including mean, weighted, and stacking fusion, were explored to further improve accuracy. The weighted fusion model gave the best predictive performance, with an MAE of 0.020, a root mean squared error (RMSE) of 0.026, and a coefficient of determination (R² score) of 0.830. Furthermore, SHapley Additive exPlanations (SHAP) analysis revealed a significant correlation between lower initial arsenic (As) content and reduced total post-refining impurity levels in both the XGBoost and LightGBM models. This study underscores the precision of ensemble learning in predicting residual impurity content in vertically zone-refined indium products.
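
To illustrate the weighted fusion step and the three metrics quoted in the abstract, the following is a minimal sketch in plain Python. It is not the authors' implementation. The abstract does not say how the fusion weights were chosen, so the inverse-MAE weighting used here is an assumption, and the function names and numbers are hypothetical placeholders rather than data from the study. The per-model lists stand in for out-of-fold predictions such as those produced by LOOCV.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def inverse_mae_weights(y_true, predictions):
    """ASSUMPTION: weight each model by 1/MAE, normalised to sum to 1.

    Fails with ZeroDivisionError if a model has an MAE of exactly zero.
    """
    inv = {name: 1.0 / mae(y_true, preds) for name, preds in predictions.items()}
    total = sum(inv.values())
    return {name: value / total for name, value in inv.items()}


def weighted_fusion(predictions, weights):
    """Per-sample weighted average of the base-model predictions."""
    n_samples = len(next(iter(predictions.values())))
    return [
        sum(weights[name] * predictions[name][i] for name in predictions)
        for i in range(n_samples)
    ]


# Synthetic placeholder values, NOT data from the study.
y_true = [0.10, 0.20, 0.15, 0.30, 0.25]
oof_predictions = {
    "xgboost": [0.12, 0.18, 0.16, 0.27, 0.26],
    "lightgbm": [0.09, 0.22, 0.13, 0.31, 0.22],
    "ridge": [0.13, 0.21, 0.17, 0.33, 0.24],
}

weights = inverse_mae_weights(y_true, oof_predictions)
fused = weighted_fusion(oof_predictions, weights)

print("weights:", weights)
print("fused MAE:", mae(y_true, fused))
print("fused RMSE:", rmse(y_true, fused))
print("fused R2:", r2(y_true, fused))
```

Because the weights are non-negative and sum to 1, the fused MAE can never exceed that of the worst base model. Whether it beats the best base model depends on how the individual errors are correlated. Note also that deriving the weights from the same out-of-fold errors used for evaluation, as this sketch does for brevity, can give an optimistic estimate of the fused model's accuracy.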

Published in Intelligent Systems with Applications

ISSN: 2667-3053 (Online)
Publisher: Elsevier
Country of publisher: United Kingdom
LCC subjects: Science: Science (General): Cybernetics; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.journals.elsevier.com/intelligent-systems-with-applications
