Remote Sensing (May 2022)

Strategies for the Storage of Large LiDAR Datasets—A Performance Comparison

  • Juan A. Béjar-Martos,
  • Antonio J. Rueda-Ruiz,
  • Carlos J. Ogayar-Anguita,
  • Rafael J. Segura-Sánchez,
  • Alfonso López-Ruiz

DOI
https://doi.org/10.3390/rs14112623
Journal volume & issue
Vol. 14, no. 11
p. 2623

Abstract

The widespread use of LiDAR technologies has led to an ever-increasing volume of captured data, which poses a continuous challenge for storage and organization if the data are to be processed and analyzed efficiently. Although files in formats such as LAS/LAZ stored on the file system are the most common solution for LiDAR data storage, databases are gaining in popularity due to their advantages: centralized and uniform access to a collection of datasets; better support for concurrent retrieval; distributed storage in database engines that support sharding; and support for metadata or spatial queries through adequate indexing or organization of the data. The present work evaluates the performance of four popular NoSQL and relational database management systems with large LiDAR datasets: Cassandra, MongoDB, MySQL and PostgreSQL. To perform a realistic assessment, we integrate these database engines in a repository implementation with an elaborate data model that enables metadata and spatial queries and progressive/partial data retrieval. Our experiments show, as expected, a modest but significant performance advantage for the NoSQL databases over the relational ones, and indicate that Cassandra provides the best overall database solution for LiDAR data.

Keywords