Journal of Big Data (Jun 2019)
Scalable architecture for Big Data financial analytics: user-defined functions vs. SQL
Abstract
Large financial organizations have hundreds of millions of financial contracts on their balance sheets. Moreover, highly volatile financial markets and heterogeneous data sets within and across banks worldwide make near real-time financial analytics very challenging, and handling them requires cutting-edge financial algorithms. However, due to a lack of data modeling standards, current financial risk algorithms are typically inconsistent and non-scalable. In this paper, we present a novel implementation of a real-world use case for performing large-scale financial analytics leveraging Big Data technology. We first provide detailed background information on the financial underpinnings of our framework along with the major financial calculations. Afterwards, we analyze the performance of different parallel implementations in Apache Spark based on existing computation kernels that apply the ACTUS data and algorithmic standard for financial contract modeling. The major contribution is a detailed discussion of the design trade-offs between applying user-defined functions on existing computation kernels vs. partially rewriting the kernel in SQL and thus taking advantage of the underlying SQL query optimizer. Our performance evaluation demonstrates almost linear scalability for the best design choice.
Keywords