We found a match
Your institution may have access to this item. Find your institution then sign in to continue.
- Title
NCC-AUC: an AUC optimization method to identify multi-biomarker panel for cancer prognosis from genomic and clinical data.
- Authors
Meng Zou; Zhaoqi Liu; Xiang-Sun Zhang; Yong Wang
- Abstract
Motivation: In prognosis and survival studies, an important goal is to identify multi-biomarker panels with predictive power using molecular characteristics or clinical observations. Such analysis is often challenged by censored, small-sample-size, but high-dimensional genomic profiles or clinical data. Therefore, sophisticated models and algorithms are in pressing need. Results: In this study, we propose a novel Area Under Curve (AUC) optimization method for multi-biomarker panel identification named Nearest Centroid Classifier for AUC optimization (NCC-AUC). Our method is motived by the connection between AUC score for classification accuracy evaluation and Harrell's concordance index in survival analysis. This connection allows us to convert the survival time regression problem to a binary classification problem. Then an optimization model is formulated to directly maximize AUC and meanwhile minimize the number of selected features to construct a predictor in the nearest centroid classifier framework. NCC-AUC shows its great performance by validating both in genomic data of breast cancer and clinical data of stage IB Non-Small-Cell Lung Cancer (NSCLC). For the genomic data, NCC-AUC outperforms Support Vector Machine (SVM) and Support Vector Machine-based Recursive Feature Elimination (SVM-RFE) in classification accuracy. It tends to select a multi-biomarker panel with low average redundancy and enriched biological meanings. Also NCC-AUC is more significant in separation of low and high risk cohorts than widely used Cox model (Cox proportional-hazards regression model) and L1-Cox model (L1 penalized in Cox model). These performance gains of NCC-AUC are quite robust across 5 subtypes of breast cancer. Further in an independent clinical data, NCC-AUC outperforms SVM and SVM-RFE in predictive accuracy and is consistently better than Cox model and L1-Cox model in grouping patients into high and low risk categories. Conclusion: In summary, NCC-AUC provides a rigorous optimization framework to systematically reveal multi-biomarker panel from genomic and clinical data. It can serve as a useful tool to identify prognostic biomarkers for survival analysis. Availability and implementation: NCC-AUC is available at http://doc.aporc.org/wiki/NCC-AUC.
- Subjects
BIOMARKERS; CANCER prognosis; SURVIVAL analysis (Biometry); SUPPORT vector machines; ALGORITHMS; FEATURE selection
- Publication
Bioinformatics, 2015, Vol 31, Issue 20, p3330
- ISSN
1367-4803
- Publication type
Article
- DOI
10.1093/bioinformatics/btv374