Machine Learning with Applications (Mar 2023)

Predicting firm performance and size using machine learning with a Bayesian perspective

  • Debdatta Saha,
  • Timothy M. Young,
  • Jessica Thacker

Journal volume & issue
Vol. 11
p. 100453

Abstract

This paper investigates the prediction of the financial performance of firms in registered manufacturing in developing countries, using machine learning methods along with economic theory to explain the findings. While the literature suggests that top-line measures related to sales are less predictable than bottom-line measures such as net profits for small informal establishments in developing countries, we find that the opposite holds for firms in registered manufacturing in the food processing industry in India. This finding is based on results from four machine learning techniques: Bayesian additive regression trees (BART), boosted trees, bootstrap forests, and regression trees. In validation, BART models outperformed the other algorithms in predicting the dependent variables. Across ten validation studies, BART had an average R² ranging from 0.922 to 0.934 and boosted tree models had an average R² ranging from 0.873 to 0.905 for predicting sales. A key significant independent variable for predicting sales across all categories and algorithms was real raw material expenses, which explained approximately 83% to 88% of the total sums of squares in all validations. This is in line with the realities of the food processing industry, which is intensive in its raw material usage. The dependent variable ‘profits’ (as measured by real profits before depreciation, interest, taxes and amortization, ‘real_pbdita’) was more difficult to predict than sales. BART models again outperformed the other algorithms in validation, with an average R² ranging from 0.745 to 0.818. Key significant variables in these models were more diverse: raw material expenses, compensation to employees, and net or total fixed assets explained the largest proportions of the total sums of squares.
The results from a machine learning approach with a Bayesian perspective can enhance the understanding of the mechanisms that translate sales into profits for registered manufacturing, thereby aiding policy-making for small businesses in the formal sector in developing countries.

Keywords