Allergology International (Apr 2024)

Best practices for multimodal clinical data management and integration: An atopic dermatitis research case

  • Tazro Ohta,
  • Ayaka Hananoe,
  • Ayano Fukushima-Nomura,
  • Koichi Ashizaki,
  • Aiko Sekita,
  • Jun Seita,
  • Eiryo Kawakami,
  • Kazuhiro Sakurada,
  • Masayuki Amagai,
  • Haruhiko Koseki,
  • Hiroshi Kawasaki

Journal volume & issue
Vol. 73, no. 2
pp. 255 – 263

Abstract

Background: In clinical research on multifactorial diseases such as atopic dermatitis, data-driven medical research has become more widely used as a means to clarify diverse pathological conditions and to realize precision medicine. However, modern clinical data are large-scale, multimodal, and multi-center, which makes data integration and management difficult and limits productivity in clinical data science. Methods: We designed a generic data management flow to collect, cleanse, and integrate the different types of data generated at multiple institutions by 10 types of clinical studies. We developed MeDIA (Medical Data Integration Assistant), software for browsing the data in an integrated manner and extracting subsets for analysis. Results: MeDIA integrates and visualizes data and information on research participants obtained from multiple studies. It provides an interface that supports data management and helps data scientists retrieve the data sets they need. Furthermore, the system promotes the use of unified terms, such as identifiers and sampling dates, to reduce the cost of pre-processing by data analysts. We also propose best practices for the clinical data management flow, which we learned from the development and implementation of MeDIA. Conclusions: The MeDIA system solves the problem of multimodal clinical data integration, from complex text data such as medical records to big data such as omics data from a large number of patients. The system and the proposed best practices can be applied not only to allergic diseases but also to other diseases to promote data-driven medical research.

Keywords