Труды Института системного программирования РАН (Oct 2018)
Dynamic compilation of SQL queries for PostgreSQL
Abstract
In recent years, as performance and capacity of main and external memory grow, performance of database management systems (DBMSes) on certain kinds of queries is more determined by raw CPU speed. Currently, PostgreSQL uses the interpreter to execute SQL queries. This yields an overhead caused by indirect calls to handler functions and runtime checks, which could be avoided if the query were compiled into native code "on-the-fly", i.e. just-in-time (JIT) compiled: at run time the specific table structure is known as well as data types and built-in functions used in the query as well as the query itself. This is especially important for complex queries, performance of which is CPU-bound. We have developed a PostgreSQL extension that implements SQL query JIT compilation using LLVM compiler infrastructure. In this paper we show how to implement LLVM-analogues of the main operators of the PostgreSQL, how to replace Volcano iterator model abstraction (open(), next(), close()) by the abstraction that is more suitable to generate code for a particular query. Currently, with LLVM JIT we achieve up to 4.3x speedup on TPC-H Q1 query as compared to original PostgreSQL interpreter.
Keywords