Arthritis Research & Therapy (Oct 2017)

Biological function integrated prediction of severe radiographic progression in rheumatoid arthritis: a nested case control study

  • Young Bin Joo,
  • Yul Kim,
  • Youngho Park,
  • Kwangwoo Kim,
  • Jeong Ah Ryu,
  • Seunghun Lee,
  • So-Young Bang,
  • Hye-Soon Lee,
  • Gwan-Su Yi,
  • Sang-Cheol Bae

DOI
https://doi.org/10.1186/s13075-017-1414-x
Journal volume & issue
Vol. 19, no. 1
pp. 1 – 9

Abstract

Background: Radiographic progression in rheumatoid arthritis (RA) is reported to be highly heritable. However, a previous study using genetic loci predicted radiographic progression with insufficient accuracy. The aim of this study was to identify a biologically relevant prediction model of radiographic progression in patients with RA using a genome-wide association study (GWAS) combined with bioinformatics analysis.

Methods: We obtained genome-wide single nucleotide polymorphism (SNP) data for 374 Korean patients with RA using Illumina HumanOmni2.5Exome-8 arrays. Radiographic progression was measured as the yearly rate of change in the Sharp/van der Heijde modified score and categorized as no progression or severe progression. SNPs significantly associated with severe radiographic progression in the GWAS were mapped to functional genes and reprioritized by post-GWAS analysis. For robust prediction of radiographic progression, tenfold cross-validation using a support vector machine (SVM) classifier was conducted. Accuracy was used to select the optimal SNP set in the Hanyang Bae RA cohort. The performance of our final model was compared with that of other models based on GWAS results and SPOT (one of the post-GWAS analyses) using receiver operating characteristic (ROC) curves. The reliability of our model was confirmed using GWAS data from Caucasian patients with RA.

Results: A total of 36,091 SNPs significant at a p value <0.05 in the GWAS were reprioritized using post-GWAS analysis, and approximately 2700 were identified as SNPs related to RA biological features. The best average accuracy across the ten groups was 0.6015 with 85 SNPs, and this increased to 0.7481 when combined with clinical information. In comparisons of model performance, the area under the curve (AUC) of 0.7872 for our model was superior to that obtained with GWAS (AUC 0.6586, p value 8.97 × 10⁻⁵) or SPOT (AUC 0.7449, p value 0.0423). Our model strategy also showed superior prediction accuracy in Caucasian patients with RA compared with GWAS (p value 0.0049) and SPOT (p value 0.0151).

Conclusions: By using various biological functions of SNPs and repeated machine learning, our model predicted severe radiographic progression in patients with RA more relevantly and robustly than models using only GWAS results or other post-GWAS tools.
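The evaluation described in the abstract rests on two generic calculations: averaging held-out accuracy over ten cross-validation folds, and summarizing a ROC curve by its AUC. The following is a minimal sketch of both in plain Python, not the authors' pipeline. The function names (`tenfold_indices`, `cross_validated_accuracy`, `roc_auc`) and the `fit`/`predict` callables are hypothetical; the study used an SVM classifier, which is left here as a pluggable placeholder.

```python
import random
from statistics import mean


def tenfold_indices(n_samples, n_folds=10, seed=0):
    """Shuffle sample indices and deal them into n_folds nearly equal folds."""
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)
    return [order[i::n_folds] for i in range(n_folds)]


def cross_validated_accuracy(features, labels, fit, predict, n_folds=10, seed=0):
    """Train on all folds but one, score on the held-out fold, and average.

    `fit(train_x, train_y)` returns a model and `predict(model, x)` returns a
    label; both are hypothetical stand-ins for a real classifier such as an SVM.
    """
    scores = []
    for held_out in tenfold_indices(len(labels), n_folds, seed):
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        model = fit([features[i] for i in train], [labels[i] for i in train])
        hits = sum(predict(model, features[i]) == labels[i] for i in held_out)
        scores.append(hits / len(held_out))
    return mean(scores)


def roc_auc(scores, labels):
    """AUC as the probability that a positive case (label 1) scores higher
    than a negative case (label 0); ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

With 374 samples, `tenfold_indices(374)` yields four folds of 38 and six folds of 37. Comparing two models' AUCs for significance, as the abstract reports, requires an additional statistical test that this sketch does not cover.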

Keywords