Entropy (Jul 2024)

NodeFlow: Towards End-to-End Flexible Probabilistic Regression on Tabular Data

  • Patryk Wielopolski,
  • Oleksii Furman,
  • Maciej Zięba

DOI
https://doi.org/10.3390/e26070593
Journal volume & issue
Vol. 26, no. 7
p. 593

Abstract

Read online

We introduce NodeFlow, a flexible framework for probabilistic regression on tabular data that combines Neural Oblivious Decision Ensembles (NODEs) and Conditional Continuous Normalizing Flows (CNFs). It offers improved modeling capabilities for arbitrary probabilistic distributions, addressing the limitations of traditional parametric approaches. In NodeFlow, the NODE captures complex relationships in tabular data through a tree-like structure, while the conditional CNF utilizes the NODE’s output space as a conditioning factor. The training process of NodeFlow employs standard gradient-based learning, facilitating the end-to-end optimization of the NODEs and CNF-based density estimation. This approach ensures outstanding performance, ease of implementation, and scalability, making NodeFlow an appealing choice for practitioners and researchers. Comprehensive assessments on benchmark datasets underscore NodeFlow’s efficacy, revealing its achievement of state-of-the-art outcomes in multivariate probabilistic regression setup and its strong performance in univariate regression tasks. Furthermore, ablation studies are conducted to justify the design choices of NodeFlow. In conclusion, NodeFlow’s end-to-end training process and strong performance make it a compelling solution for practitioners and researchers. Additionally, it opens new avenues for research and application in the field of probabilistic regression on tabular data.

Keywords