Informatics in Medicine Unlocked (Jan 2018)

An optimal big data workflow for biomedical image analysis

  • Aurelle Tchagna Kouanou,
  • Daniel Tchiotsop,
  • Romanic Kengne,
  • Djoufack Tansaa Zephirin,
  • Ngo Mouelas Adele Armele,
  • René Tchinda

Journal volume & issue
Vol. 11
pp. 68 – 74

Abstract

Read online

Background and objective: In the medical field, data volume is increasingly growing, and traditional methods cannot manage it efficiently. In biomedical computation, the continuous challenges are: management, analysis, and storage of the biomedical data. Nowadays, big data technology plays a significant role in the management, organization, and analysis of data, using machine learning and artificial intelligence techniques. It also allows a quick access to data using the NoSQL database. Thus, big data technologies include new frameworks to process medical data in a manner similar to biomedical images. It becomes very important to develop methods and/or architectures based on big data technologies, for a complete processing of biomedical image data. Method: This paper describes big data analytics for biomedical images, shows examples reported in the literature, briefly discusses new methods used in processing, and offers conclusions. We argue for adapting and extending related work methods in the field of big data software, using Hadoop and Spark frameworks. These provide an optimal and efficient architecture for biomedical image analysis. This paper thus gives a broad overview of big data analytics to automate biomedical image diagnosis. A workflow with optimal methods and algorithm for each step is proposed. Results: Two architectures for image classification are suggested. We use the Hadoop framework to design the first, and the Spark framework for the second. The proposed Spark architecture allows us to develop appropriate and efficient methods to leverage a large number of images for classification, which can be customized with respect to each other. Conclusions: The proposed architectures are more complete, easier, and are adaptable in all of the steps from conception. The obtained Spark architecture is the most complete, because it facilitates the implementation of algorithms with its embedded libraries. Keywords: Biomedical images, Big data, Artificial intelligence, Machine learning, Hadoop/spark