OncoTargets and Therapy (Aug 2015)

Screening of feature genes in distinguishing different types of breast cancer using support vector machine

  • Wang Q,
  • Liu XD

Journal volume & issue
Vol. 2015
pp. 2311 – 2317

Abstract

Qi Wang, Xudong Liu

Department of Emergency Surgery, Affiliated Hospital of Inner Mongolia Medical University, Hohhot, People’s Republic of China

Objective: To screen for feature genes that distinguish estrogen receptor-positive (ER+) breast cancer from estrogen receptor-negative (ER-) breast cancer.

Methods: Nine microarray datasets of ER+ and ER- breast cancer samples were collected from the Gene Expression Omnibus database. After preprocessing, the data in five training sets were analyzed using significance analysis of microarrays to screen for differentially expressed genes (DEGs). The DEGs were then analyzed with the support vector machine (SVM) function in the e1071 package of R to construct an SVM classifier, whose performance was verified on four testing sets and on their combination with the training sets using leave-one-out cross-validation. Feature genes obtained by the SVM classifier were subjected to function- and pathway-enrichment analysis via the Database for Annotation, Visualization and Integrated Discovery and the KEGG Orthology Based Annotation System, respectively.

Results: A total of 526 DEGs were identified between ER+ and ER- breast cancer. The SVM classifier showed that these genes could distinguish samples of the two subtypes with an accuracy above 90%, and it also showed good sensitivity, specificity, positive and negative predictive values, and area under the receiver operating characteristic curve. Inflammatory and hormone-related biological processes were enriched in both functional analyses, indicating that inflammatory (ie, IL8) and hormone-regulation (ie, CGA) genes may be among the feature genes that distinguish ER+ from ER- breast cancer.

Conclusion: Gene-expression profile data can provide feature genes that distinguish ER+ from ER- samples, and the identified genes can be used as biomarkers for ER+ samples.

Keywords: classification, differentially expressed genes, biomarker
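
The evaluation measures named in the Results (accuracy, sensitivity, specificity, positive and negative predictive values, and area under the receiver operating characteristic curve) can be illustrated with a minimal sketch. This is not the authors' code: the study used the e1071 package in R, whereas the example below is plain Python. The function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, with ER+ coded as 1 and ER- as 0.

```python
from __future__ import annotations

from typing import Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), treating label 1 (assumed here to mean ER+) as positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def classifier_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> dict[str, float]:
    """Compute the standard confusion-matrix metrics.

    Assumes both classes occur in y_true and in y_pred; otherwise a
    denominator is zero and a ZeroDivisionError is raised.
    """
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the positive
    sample has the higher score; tied pairs count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not from the study): 4 ER+ and 4 ER- samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]  # made-up classifier scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]    # assumed threshold of 0.5

metrics = classifier_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics)
print(auc)
```

On this toy data one ER+ sample and one ER- sample fall on the wrong side of the threshold, so every confusion-matrix metric is 0.75, while the AUC of 0.9375 reflects that 15 of the 16 positive-negative pairs are ranked correctly. The AUC does not depend on the chosen threshold, which is why it is usually reported alongside the threshold-dependent measures.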