Data augmentation using SMOTE technique: Application for prediction of burst pressure of hydrocarbons pipeline using supervised machine learning models

Afzal Ahmed Soomro; Ainul Akmar Mokhtar; Masdi B. Muhammad; Mohamad Hanif Md Saad; Najeebullah Lashari; Muhammad Hussain; Abdul Sattar Palli

Results in Engineering (Dec 2024)

Affiliations

Afzal Ahmed Soomro: Department of Mechanical and Manufacturing Engineering, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia; Mechanical Engineering Department, Universiti Teknologi PETRONAS, Seri Iskandar, Perak Darul Ridzuan 32610, Malaysia; Corresponding author at: Mechanical Engineering Department, Universiti Teknologi PETRONAS, Seri Iskandar, Perak Darul Ridzuan 32610, Malaysia.
Ainul Akmar Mokhtar: Mechanical Engineering Department, Universiti Teknologi PETRONAS, Seri Iskandar, Perak Darul Ridzuan 32610, Malaysia
Masdi B. Muhammad: Mechanical Engineering Department, Universiti Teknologi PETRONAS, Seri Iskandar, Perak Darul Ridzuan 32610, Malaysia
Mohamad Hanif Md Saad: Department of Mechanical and Manufacturing Engineering, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia
Najeebullah Lashari: Petroleum and Gas Engineering Department, Dawood University of Engineering & Technology, M.A Jinnah Road, Karachi 74800, Pakistan; Petroleum Engineering Department, Universiti Teknologi PETRONAS, Seri Iskandar, Perak Darul Ridzuan 32610, Malaysia
Muhammad Hussain: University of Wollongong, Northfields Ave, Wollongong, NSW 2522, Australia
Abdul Sattar Palli: Computer and Information Science Department, Universiti Teknologi PETRONAS, Seri Iskandar, Perak Darul Ridzuan 32610, Malaysia

Journal volume & issue: Vol. 24, p. 103233

Abstract

Accurate burst pressure prediction is critical for ensuring oil and gas pipeline safety, guiding maintenance decisions, and lowering costs and risks. Traditional methods have limitations, including high experimental costs, conservative empirical models, and computationally expensive numerical algorithms. In recent years, machine learning (ML) models have supplanted these traditional methods. However, small and imbalanced datasets remain a major challenge in building an ML model that produces accurate results. Moreover, ML models trained on a dataset of pipelines with specific material grades generalize poorly and do not produce comparable results on other pipeline types. In this work, finite element analysis (FEA) was first used to generate a dataset. Then, a new way to improve ML model generalization for burst pressure prediction is proposed: combining publicly available datasets of different pipeline specifications. In this combined dataset, some pipelines have many data samples and others have few, which causes a class imbalance issue. The Synthetic Minority Oversampling Technique (SMOTE) was applied to address this imbalance. The performance of several ML models, namely Extra Trees (ET), Extreme Gradient Boosting (XGBR), Random Forest (RF), Light Gradient Boosting Machine (LGBM), and Decision Tree (DT), was evaluated to validate prediction and generalization on pipelines of various material grades. The results show that all selected ML models achieved a high R-squared (>0.95) on the balanced data, compared with the imbalanced dataset. These results indicate that SMOTE-based augmentation is a beneficial way to correct dataset imbalance and improve ML-based burst pressure prediction for oil and gas pipelines.
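
The oversampling idea described in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation. It assumes that each pipeline group is a list of numeric rows, that the last column holds the burst pressure, and that synthetic rows are made by interpolating all columns (target included) between a sample and one of its nearest neighbours in the same group. The function names, the group labels, and every number in the example data are hypothetical placeholders, not values from the paper.

```python
import math
import random


def smote_like(rows, n_new, k=3, seed=0):
    """Create n_new synthetic rows by interpolating between a randomly chosen
    row and one of its k nearest neighbours (Euclidean distance, all columns)."""
    if len(rows) < 2:
        raise ValueError("need at least two rows to interpolate")
    rng = random.Random(seed)
    k = min(k, len(rows) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(rows))
        base = rows[i]
        # Indices of the k rows closest to the base row, excluding the base itself.
        neighbours = sorted(
            (j for j in range(len(rows)) if j != i),
            key=lambda j: math.dist(base, rows[j]),
        )[:k]
        other = rows[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def balance_groups(groups, k=3, seed=0):
    """Oversample every group up to the size of the largest group."""
    target = max(len(rows) for rows in groups.values())
    balanced = {}
    for offset, (name, rows) in enumerate(groups.items()):
        extra = []
        if len(rows) < target:
            extra = smote_like(rows, target - len(rows), k, seed + offset)
        balanced[name] = list(rows) + extra
    return balanced


# Placeholder data (assumed columns): diameter [mm], wall thickness [mm],
# defect depth [mm], burst pressure [MPa]. Values are invented for illustration.
groups = {
    "grade_A": [
        [508.0, 6.4, 2.0, 14.1],
        [508.0, 6.4, 3.2, 12.5],
        [508.0, 7.1, 1.5, 16.0],
        [610.0, 7.9, 2.8, 13.2],
        [610.0, 7.9, 4.0, 11.4],
        [610.0, 9.5, 3.0, 15.3],
    ],
    "grade_B": [
        [324.0, 5.6, 1.0, 18.9],
        [324.0, 5.6, 2.4, 16.2],
        [406.0, 6.4, 2.0, 15.7],
    ],
}

balanced = balance_groups(groups)
for name, rows in balanced.items():
    print(name, len(rows))
```

In this sketch the distances are computed on unscaled columns, so a large-valued column such as diameter dominates the neighbour search; in practice the features would likely be standardized first. Whether the target should be interpolated together with the features is also a design choice assumed here, not something the abstract specifies.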

Published in Results in Engineering

ISSN: 2590-1230 (Online)
Publisher: Elsevier
Country of publisher: Netherlands
LCC subjects: Technology
Website: https://www.journals.elsevier.com/results-in-engineering
