Database Systems Journal (Jan 2025)
Query Completion for Small-Scale Distributed Databases in PostgreSQL and MongoDB
Abstract
Relational/SQL and document/JSON data stores are competing but complementary technologies in OLAP (On-Line Analytical Processing) systems. Whereas traditional performance comparisons rely on query execution time, this paper compares two distributed setups, one deployed on PostgreSQL/Citus and one on MongoDB, by focusing solely on whether queries complete within a 10-minute timeout. The TPC-H benchmark schema was converted into a denormalized JSON schema for MongoDB. An initial set of 296 SQL queries was executed in PostgreSQL/Citus and then mapped to MongoDB’s Aggregation Framework. Query completion outcomes were recorded across six scenarios defined by two data sizes (0.01 GB, 0.1 GB) and three node counts (3, 6, 9). Relationships between completion rates and query parameters were assessed using statistical tests and machine learning techniques.
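The experimental grid and the completion criterion described above can be illustrated with a minimal Python sketch. It is not the authors' harness: the names `SCENARIOS`, `completed`, and `completion_rate` are hypothetical, and two conventions are assumed here, namely that a query finishing exactly at the timeout counts as completed and that `None` marks a query that never returned.

```python
from itertools import product

# Values taken from the abstract.
TIMEOUT_SECONDS = 10 * 60      # 10-minute timeout
DATA_SIZES_GB = (0.01, 0.1)    # two data sizes
NODE_COUNTS = (3, 6, 9)        # three node counts

# The six scenarios are the Cartesian product of data sizes and node counts.
SCENARIOS = list(product(DATA_SIZES_GB, NODE_COUNTS))


def completed(elapsed_seconds):
    """Return True if a query finished within the timeout.

    Assumption: None represents a query that was aborted or never
    returned, and the boundary value (exactly 600 s) counts as completed.
    """
    return elapsed_seconds is not None and elapsed_seconds <= TIMEOUT_SECONDS


def completion_rate(elapsed_times):
    """Fraction of queries in one scenario that completed in time."""
    if not elapsed_times:
        return 0.0
    return sum(completed(t) for t in elapsed_times) / len(elapsed_times)


if __name__ == "__main__":
    for size_gb, nodes in SCENARIOS:
        print(f"scenario: {size_gb} GB on {nodes} nodes")
```

Under these assumptions, each scenario yields one completion rate per system, which is the quantity the study compares in place of raw execution time.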