Revista Brasileira de Zootecnia (Nov 2024)
Comparison of supervised machine learning and variable selection methods for body weight prediction of growth pigs using image processing data
Abstract
This research aimed to compare four supervised learning methods (random forest, ridge, LASSO, and elastic net regression) for predicting body weight in purebred and crossbred pigs reared in Brazil. Predictions were based on features extracted from dorsal-view images obtained by video image processing. The study involved 69 animals from purebred and crossbred groups, including Large White, Piau, Duroc × Large White, and Piau × Large White. Data collection spanned 144 days, with measurements taken at approximately 20-day intervals, totaling eight measurements per animal across the growth stages. Images were acquired in individual pens using an Intel RealSense Depth D435 digital camera. Four features (back area, back perimeter, back width, and body depth) were extracted from the images. Pearson's correlation analysis was conducted to assess the relationship between live weight and these features. The dataset was randomly divided into a training set (65%) and a test set (35%), and models were trained by five-fold cross-validation balanced according to growth stage, which was divided into three groups. This procedure was repeated 100 times, and the reported metrics are averages over the 100 repetitions. Random forest outperformed the other methods, although by a small margin, with the highest average R² (0.87) and the lowest average RMSE (14.32) and average MAE (10.13). Consequently, the random forest algorithm proved to be the most effective in predicting body weight. Back area, back width, and back perimeter were the most important variables in the model.
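The evaluation protocol described above (a random 65/35 split repeated 100 times, with R², RMSE, and MAE averaged over the repetitions) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the authors' code: the data are synthetic, the function names are hypothetical, and a one-feature least-squares line stands in for the random forest and penalized regressions. The inner five-fold cross-validation used for model training is omitted.

```python
import math
import random
from statistics import mean


def regression_metrics(y_true, y_pred):
    """Return (R2, RMSE, MAE) for paired observed and predicted values."""
    n = len(y_true)
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return r2, rmse, mae


def fit_simple_linear(x, y):
    """Stand-in model (not from the study): least squares on one feature."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda xs: [intercept + slope * xi for xi in xs]


def repeated_holdout(x, y, fit, n_repeats=100, train_frac=0.65, seed=0):
    """Average test-set R2, RMSE, and MAE over repeated random splits."""
    rng = random.Random(seed)
    n = len(y)
    cut = round(train_frac * n)
    results = []
    for _ in range(n_repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        train, test = idx[:cut], idx[cut:]
        predict = fit([x[i] for i in train], [y[i] for i in train])
        y_hat = predict([x[i] for i in test])
        results.append(regression_metrics([y[i] for i in test], y_hat))
    # zip(*results) groups the R2, RMSE, and MAE values into three columns
    return tuple(mean(col) for col in zip(*results))


# Synthetic data in arbitrary units: "weight" grows linearly with "back area".
data_rng = random.Random(42)
back_area = [data_rng.uniform(500.0, 3000.0) for _ in range(200)]
weight = [0.04 * a + data_rng.gauss(0.0, 5.0) for a in back_area]

r2, rmse, mae = repeated_holdout(back_area, weight, fit_simple_linear)
print(f"mean R2={r2:.3f}  mean RMSE={rmse:.2f}  mean MAE={mae:.2f}")
```

Averaging over many random splits reduces the dependence of the reported metrics on any single partition, which matters for a dataset of this size. To compare several methods in the same way, each one would be passed as the `fit` argument with the same seed so that all methods see identical splits.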
Keywords