Neuropsychopharmacology Reports (Mar 2024)

Machine learning algorithm‐based estimation model for the severity of depression assessed using Montgomery‐Asberg depression rating scale

  • Masanori Shimamoto,
  • Kanako Ishizuka,
  • Kento Ohtani,
  • Toshiya Inada,
  • Maeri Yamamoto,
  • Masako Tachibana,
  • Hiroki Kimura,
  • Yusuke Sakai,
  • Kazuhiro Kobayashi,
  • Norio Ozaki,
  • Masashi Ikeda

DOI
https://doi.org/10.1002/npr2.12404
Journal volume & issue
Vol. 44, no. 1
pp. 115 – 120

Abstract

Aim: Depressive disorder is often evaluated using established rating scales. However, consistent data collection with these scales requires trained professionals. In the present study, the “rater & estimation‐system” reliability was assessed between consensus evaluation by trained psychiatrists and the estimation by two models of the AI‐MADRS (Montgomery‐Asberg Depression Rating Scale) estimation system, a machine learning algorithm‐based model developed to assess the severity of depression.

Methods: During interviews with trained psychiatrists and the AI‐MADRS estimation system, patients responded orally to machine‐generated voice prompts from the AI‐MADRS structured interview questions. The severity scores estimated by two models of the AI‐MADRS estimation system, the max estimation model and the average estimation model, were compared with those assigned by trained psychiatrists.

Results: A total of 51 evaluation interviews conducted on 30 patients were analyzed. Pearson's correlation coefficient with the scores evaluated by trained psychiatrists was 0.76 (95% confidence interval 0.62–0.86) for the max estimation model, and 0.86 (0.76–0.92) for the average estimation model. The ANOVA intraclass correlation coefficient (ICC) for rater & estimation‐system reliability with the evaluation scores by trained psychiatrists was 0.51 (−0.09 to 0.79) for the max estimation model, and 0.75 (0.55–0.86) for the average estimation model.

Conclusion: The average estimation model of AI‐MADRS demonstrated substantially acceptable rater & estimation‐system reliability with trained psychiatrists. Accumulating a broader training dataset and refining the AI‐MADRS interviews are expected to improve the performance of AI‐MADRS. Our findings suggest that AI technologies can significantly modernize and potentially revolutionize the realm of depression assessments.
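
The two statistics reported in the Results measure different things: Pearson's correlation captures linear association, whereas an ANOVA-based ICC for agreement also penalizes systematic offsets between raters. The following is a minimal sketch of both, not the authors' analysis code. It assumes a two-way, absolute-agreement, single-measurement ICC (often written ICC(2,1)), which the abstract does not specify, and a Fisher z-transform confidence interval, which the abstract also does not specify. The function names and the four toy score pairs are hypothetical and are not study data.

```python
import math
from statistics import NormalDist, mean


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, level=0.95):
    """Approximate CI for r via the Fisher z-transform.

    Assumes n independent pairs (an assumption; repeated interviews of
    the same patient would violate it).
    """
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


def icc_agreement(ratings):
    """Two-way, absolute-agreement, single-measurement ICC (assumed form).

    `ratings` is a list of rows, one per interview, each holding one
    score per rater (here: psychiatrist consensus, estimation system).
    """
    n, k = len(ratings), len(ratings[0])
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Mean squares from the two-way ANOVA decomposition.
    msr = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msc = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    sse = sum(
        (ratings[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    mse = sse / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical toy scores: the "system" is always 5 points higher.
clinician = [10, 20, 30, 40]
system = [s + 5 for s in clinician]

r = pearson_r(clinician, system)
icc = icc_agreement([list(pair) for pair in zip(clinician, system)])
ci_low, ci_high = fisher_ci(0.86, 51)
```

In this toy case the correlation is 1 while the agreement ICC is about 0.93, because a constant offset leaves the linear association intact but reduces absolute agreement. This is one general reason an ICC can sit below the corresponding Pearson coefficient. Applying `fisher_ci` to r = 0.86 with n = 51 gives roughly 0.77 to 0.92, close to the reported interval, although the 51 interviews came from 30 patients and so are not fully independent.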

Keywords