Arabian Journal of Chemistry (Sep 2023)

Application of robust principal component analysis–multivariate adaptive regression splines for the determination of °API gravity in crude oil samples using ATR-FTIR spectroscopy

  • Mahsa Mohammadi,
  • Mohammadreza Khanmohammadi Khorrami

Journal volume & issue
Vol. 16, no. 9
p. 105083

Abstract

Robust principal component analysis combined with multivariate adaptive regression splines (r-PCA-MARS) was applied and validated for the quantitative determination of American Petroleum Institute (°API) gravity values in crude oils. Seven principal component (PC) scores, explaining 95.00% of the variance in the principal component analysis (PCA), were used as inputs to the MARS model. The calibration and prediction sets were obtained using the duplex algorithm and were used to construct and then validate the model, respectively. The calibration set (67 × 7) was used to build the r-PCA-MARS model. Partial least squares regression (PLS-R) and support vector machine regression (SVM-R) models were used for comparison in determining the °API gravity of crude oils. We also compared the Kennard-Stone (KS) and duplex splitting methods for the PLS-R and SVM-R models. The performance of the r-PCA-MARS model was evaluated using the coefficient of determination (R²), R² estimated by generalized cross-validation (R²GCV), root mean square error of calibration (RMSEC), root mean square error of prediction (RMSEP), and mean absolute error (MAE). The optimal r-PCA-MARS model uses 32 basis functions to characterize the °API gravity values in crude oils. The correlation coefficients for the calibration and prediction sets were 0.997 and 0.926, respectively. The RMSEC, RMSEP, MAE, and R²GCV of the piecewise-cubic r-PCA-MARS model were 6.726 × 10⁻¹³, 0.538, 0.290, and 0.988, respectively. According to the results, the r-PCA-MARS model was more efficient than the commonly used regression models for predicting °API gravity values in crude oils, outperforming both the PLS-R and SVM-R models. It can be concluded that the r-PCA-MARS model is an appropriate model for describing the physicochemical properties of crude oil samples in the oil industry.
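
As an illustration of the error metrics named in the abstract, the following minimal Python sketch computes a root mean square error, a mean absolute error, and a coefficient of determination for a set of reference and predicted values. The function names and the four sample °API values are hypothetical and are not taken from the study. The coefficient of determination is assumed here to follow the common definition 1 - SS_res/SS_tot, which the abstract does not spell out. The same `rmse` function would give RMSEC when applied to the calibration set and RMSEP when applied to the prediction set.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def mae(y_true, y_pred):
    """Mean absolute error between reference and predicted values."""
    absolute_errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(absolute_errors) / len(absolute_errors)


def r_squared(y_true, y_pred):
    """Coefficient of determination, assumed to be 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical reference and predicted °API gravity values (not from the paper).
api_reference = [30.0, 32.0, 34.0, 36.0]
api_predicted = [30.5, 31.5, 34.5, 35.5]

print("RMSE:", round(rmse(api_reference, api_predicted), 3))
print("MAE: ", round(mae(api_reference, api_predicted), 3))
print("R2:  ", round(r_squared(api_reference, api_predicted), 3))
```

A calibration error far smaller than the prediction error, as reported in the abstract, would appear in this sketch as `rmse` returning a value near zero on the calibration pairs and a clearly larger value on the prediction pairs.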

Keywords