PLOS Digital Health (Jun 2024)

Explainable deep learning for disease activity prediction in chronic inflammatory joint diseases.

  • Cécile Trottet,
  • Ahmed Allam,
  • Aron N Horvath,
  • Axel Finckh,
  • Thomas Hügle,
  • Sabine Adler,
  • Diego Kyburz,
  • Raphael Micheroli,
  • Michael Krauthammer,
  • Caroline Ospelt

DOI
https://doi.org/10.1371/journal.pdig.0000422
Journal volume & issue
Vol. 3, no. 6
p. e0000422

Abstract

Analysing complex diseases such as chronic inflammatory joint diseases (CIJDs), where many factors influence how the disease evolves over time, is a challenging task. CIJDs are rheumatic diseases in which the immune system attacks healthy organs, mainly the joints. Environmental, genetic and demographic factors all affect disease development and progression. The Swiss Clinical Quality Management in Rheumatic Diseases (SCQM) Foundation maintains a national database of CIJDs that documents disease management over time for 19,267 patients. We propose the Disease Activity Score Network (DAS-Net), an explainable multi-task learning model trained on data from patients with different arthritis subtypes, which transforms longitudinal patient journeys into comparable representations and predicts multiple disease activity scores. First, we built a modular model composed of feed-forward neural networks, long short-term memory networks and attention layers to process the heterogeneous patient histories and predict future disease activity. Second, we investigated the utility of the model's computed patient representations (latent embeddings) for identifying patients with similar disease progression. Third, we enhanced the explainability of our model by analysing the impact of different patient characteristics on disease progression and by contrasting our model outcomes with medical expert knowledge. To this end, we explored multiple feature attribution methods, including SHAP, attention attribution and feature weighting using case-based similarity. Our model outperforms temporal and non-temporal neural network, tree-based, and naive static baselines in predicting future disease activity scores. For identifying similar patients, a k-nearest neighbours regression algorithm applied to the model's computed latent representations outperforms baseline strategies that use raw input feature representations.
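As a rough illustration of the similar-patient step, the sketch below applies k-nearest neighbours regression to a set of patient embeddings. It is not the authors' implementation: the function name `knn_regress`, the Euclidean distance metric, the unweighted mean over neighbours and the toy data are all assumptions made here for illustration, and the embeddings would in practice come from the trained model.

```python
import numpy as np

def knn_regress(train_emb, train_targets, query_emb, k=3):
    """Predict a score for each query patient as the mean score of the
    k training patients whose embeddings are closest (hypothetical helper)."""
    train_emb = np.asarray(train_emb, dtype=float)
    train_targets = np.asarray(train_targets, dtype=float)
    query_emb = np.atleast_2d(np.asarray(query_emb, dtype=float))

    # Pairwise Euclidean distances, shape (n_queries, n_train).
    # Assumption: the abstract does not state which distance metric is used.
    diffs = query_emb[:, None, :] - train_emb[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)

    # Indices of the k nearest training patients for each query.
    nearest = np.argsort(dists, axis=1)[:, :k]

    # Unweighted average of the neighbours' scores.
    preds = train_targets[nearest].mean(axis=1)
    return preds, nearest

# Toy 2-D "embeddings" and made-up disease activity scores.
train_emb = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
train_scores = [2.0, 4.0, 6.0, 8.0]
queries = [[0.0, 0.4], [10.0, 10.6]]

preds, neighbours = knn_regress(train_emb, train_scores, queries, k=2)
print(preds)
print(neighbours)
```

The same function can be run on raw input features instead of embeddings, which is the kind of baseline comparison the abstract describes; only the input matrix changes.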