Software (May 2024)

A MongoDB Document Reconstruction Support System Using Natural Language Processing

  • Kohei Hamaji,
  • Yukikazu Nakamoto

DOI
https://doi.org/10.3390/software3020010
Journal volume & issue
Vol. 3, no. 2
pp. 206 – 225

Abstract

Read online

Document-oriented databases, a type of Not Only SQL (NoSQL) database, are gaining popularity owing to their flexibility in data handling and performance for large-scale data. MongoDB, a typical document-oriented database, is a database that stores data in the JSON format, where the upper field involves lower fields and fields with the same related parent. One feature of this document-oriented database is that data are dynamically stored in an arbitrary location without explicitly defining a schema in advance. This flexibility violates the above property and causes difficulties for application program readability and database maintenance. To address these issues, we propose a reconstruction support method for document structures in MongoDB. The method uses the strength of the Has-A relationship between the parent and child fields, as well as the similarity of field names in the MongoDB documents in natural language processing, to reconstruct the data structure in MongoDB. As a result, the method transforms the parent and child fields into more coherent data structures. We evaluated our methods using real-world data and demonstrated their effectiveness.

Keywords