大数据 (Jan 2024)
Data expansion method for genetic engineering of special materials with small sample data
Abstract
With the increasing diversity and complexity of material requirements for underground water conservancy and water pipeline networks, the efficient and convenient design of special materials that meet individual needs through machine learning has become an active research topic. Traditional supervised learning methods rely on large datasets to train models, but obtaining large datasets for the special materials required in deeply buried underground water pipeline networks and high-end military equipment, such as rare alloys and high-entropy alloys, is extremely costly and time-consuming. To solve this problem, we propose a small-sample expansion model, RX-SMOGN, which uses the XGBoost and RFECV algorithms for feature screening and the SMOGN algorithm to enrich the dataset. In this paper, the phase structure of high-entropy alloys is taken as the research object, and traditional machine learning models are trained to predict it in order to verify the effectiveness of the RX-SMOGN model. The results of 5-fold cross-validation on 4 evaluation indicators show that the RX-SMOGN model improves the performance of the machine learning models and provides a more convenient and more efficient method for alloy material design.
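The data-enrichment step can be illustrated with a minimal sketch. This is not the SMOGN algorithm itself and not the authors' implementation. It is a simplified stand-in, written with the Python standard library only, for the general idea SMOGN builds on: interpolate between a seed sample and a nearby neighbour, and fall back to small Gaussian noise when the neighbour is too far away. The function name `augment_small_sample`, the parameters `k` and `noise_scale`, and the toy rows are all hypothetical, and the sketch omits both the XGBoost/RFECV feature screening and SMOGN's selection of which samples to oversample.

```python
import math
import random
import statistics


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric rows."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def augment_small_sample(rows, n_new, k=3, noise_scale=0.02, seed=0):
    """Return n_new synthetic rows built from a small list of numeric rows.

    Simplified, SMOGN-inspired rule (an assumption, not the published method):
    interpolate toward a close neighbour, otherwise add Gaussian noise.
    """
    rng = random.Random(seed)
    n_cols = len(rows[0])
    # Per-column standard deviation sets the scale of the Gaussian noise.
    col_std = [statistics.pstdev([r[j] for r in rows]) for j in range(n_cols)]
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(rows)
        # k nearest neighbours of the seed row, excluding the seed itself.
        others = [r for r in rows if r is not base]
        neighbours = sorted(others, key=lambda r: euclidean(base, r))[:k]
        dists = [euclidean(base, r) for r in neighbours]
        # "Safe" radius: half the median distance to the k neighbours.
        safe_radius = statistics.median(dists) / 2
        pick = rng.randrange(len(neighbours))
        if dists[pick] < safe_radius:
            # Close neighbour: sample a point on the segment between the rows.
            t = rng.random()
            new_row = [b + t * (n - b) for b, n in zip(base, neighbours[pick])]
        else:
            # Distant neighbour: stay near the seed and add small noise.
            new_row = [b + rng.gauss(0.0, noise_scale * s)
                       for b, s in zip(base, col_std)]
        synthetic.append(new_row)
    return synthetic


# Toy rows (made-up numbers): two features followed by a numeric target.
toy_rows = [
    [1.0, 2.0, 0.5],
    [1.2, 2.1, 0.6],
    [0.9, 1.8, 0.4],
    [3.0, 4.0, 1.5],
    [3.1, 4.2, 1.6],
]
expanded = toy_rows + augment_small_sample(toy_rows, n_new=10, seed=1)
print(len(expanded))
```

In a pipeline like the one described above, a step of this kind would run after feature screening and only on the training folds, so that synthetic rows never leak into the data used for evaluation during cross-validation.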