BMC Medical Research Methodology (Oct 2010)

Polytomous diagnosis of ovarian tumors as benign, borderline, primary invasive or metastatic: development and validation of standard and kernel-based risk prediction models

  • Testa Antonia C,
  • Van Holsbeke Caroline,
  • Valentin Lil,
  • Van Calster Ben,
  • Bourne Tom,
  • Van Huffel Sabine,
  • Timmerman Dirk

DOI
https://doi.org/10.1186/1471-2288-10-96
Journal volume & issue
Vol. 10, no. 1
p. 96

Abstract

Background Hitherto, risk prediction models for preoperative ultrasound-based diagnosis of ovarian tumors were dichotomous (benign versus malignant). We develop and validate polytomous models (models that predict more than two events) to diagnose ovarian tumors as benign, borderline, primary invasive or metastatic invasive. The main focus is on how different types of models perform and compare.
Methods A multi-center dataset containing 1066 women was used for model development and internal validation, whilst another multi-center dataset of 1938 women was used for temporal and external validation. Models were based on standard logistic regression and on penalized kernel-based algorithms (least squares support vector machines and kernel logistic regression). We used true polytomous models as well as combinations of dichotomous models based on the 'pairwise coupling' technique to produce polytomous risk estimates. Careful variable selection was performed, based largely on cross-validated c-index estimates. Model performance was assessed with the dichotomous c-index (i.e. the area under the ROC curve) and a polytomous extension, and with calibration graphs.
Results For all models, between 9 and 11 predictors were selected. Internal validation was successful with polytomous c-indexes between 0.64 and 0.69. For the best model dichotomous c-indexes were between 0.73 (primary invasive vs metastatic) and 0.96 (borderline vs metastatic). On temporal and external validation, overall discrimination performance was good with polytomous c-indexes between 0.57 and 0.64. However, discrimination between primary and metastatic invasive tumors decreased to near random levels. Standard logistic regression performed well in comparison with advanced algorithms, and combining dichotomous models performed well in comparison with true polytomous models. The best model was a combination of dichotomous logistic regression models. This model is available online.
Conclusions We have developed models that successfully discriminate between benign, borderline, and invasive ovarian tumors. Methodologically, the combination of dichotomous models was an interesting approach to tackle the polytomous problem. Standard logistic regression models were not outperformed by regularized kernel-based alternatives, a finding to which the careful variable selection procedure will have contributed. The random discrimination between primary and metastatic invasive tumors on temporal/external validation demonstrated once more the necessity of validation studies.
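
Illustrative sketch (not taken from the article): the Methods mention combining dichotomous models through 'pairwise coupling' to obtain polytomous risk estimates. The abstract does not say which coupling rule the authors used, so the Python sketch below assumes one simple closed-form rule often attributed to Price et al., p_i = 1 / (sum over j != i of 1/r_ij - (K - 2)), followed by normalization. The function name `pairwise_coupling`, the matrix `r`, and the example probabilities are hypothetical and chosen only for illustration.

```python
def pairwise_coupling(r):
    """Combine pairwise probabilities into one probability per class.

    r[i][j] is an estimate of P(class i | class is i or j) for i != j,
    e.g. the output of a dichotomous model trained on classes i and j only.
    Diagonal entries are ignored. All off-diagonal entries are assumed
    to lie strictly between 0 and 1.
    """
    k = len(r)
    raw = []
    for i in range(k):
        # Each 1/r[i][j] equals 1 + p_j/p_i when the pairwise estimates
        # are mutually consistent, so the sum minus (k - 2) equals 1/p_i.
        inv_sum = sum(1.0 / r[i][j] for j in range(k) if j != i)
        raw.append(1.0 / (inv_sum - (k - 2)))
    # Normalize, since estimated pairwise probabilities are rarely
    # perfectly consistent and the raw values may not sum to one.
    total = sum(raw)
    return [p / total for p in raw]


# Hypothetical four-class example (benign, borderline, primary invasive,
# metastatic): build consistent pairwise probabilities from made-up risks.
true_p = [0.6, 0.2, 0.15, 0.05]
r = [[0.0 if i == j else true_p[i] / (true_p[i] + true_p[j])
      for j in range(4)] for i in range(4)]
estimated = pairwise_coupling(r)
```

With perfectly consistent pairwise inputs, as in this made-up example, the rule recovers the original class probabilities. With real fitted models the six pairwise estimates generally disagree slightly, which is why the final normalization step is needed.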