- Title
FunSAV: Predicting the Functional Effect of Single Amino Acid Variants Using a Two-Stage Random Forest Model.
- Authors
Mingjun Wang; Xing-Ming Zhao; Kazuhiro Takemoto; Haisong Xu; Yuan Li; Tatsuya Akutsu; Jiangning Song; Schonbach, Christian
- Abstract
Single amino acid variants (SAVs) are the most abundant form of known genetic variations associated with human disease. Successful prediction of the functional impact of SAVs from sequences can thus lead to an improved understanding of the underlying mechanisms of why a SAV may be associated with certain diseases. In this work, we constructed a structural dataset that contained 679 high-quality protein structures with 2,048 SAVs by collecting human genetic variant data from multiple resources and dividing them into two categories, i.e., disease-associated and neutral variants. We built a two-stage random forest (RF) model, termed FunSAV, to predict the functional effect of SAVs by combining sequence, structure and residue-contact network features with additional features that were not explored in previous studies. Importantly, a two-step feature selection procedure was proposed to select the most informative features that contribute to the prediction of disease association of SAVs. In cross-validation experiments on the benchmark dataset, FunSAV achieved an area under the curve (AUC) of 0.882, which is competitive with and in some cases better than existing tools including SIFT, SNAP, Polyphen2, PANTHER, nsSNPAnalyzer and PhD-SNP. The source code of FunSAV and the datasets can be downloaded at http://sunflower.kuicr.kyoto-u.ac.jp/~sjn/FunSAV.
- Subjects
AMINO acids; HUMAN genetic variation; EXPERIMENTS; PERFORMANCE; DISEASES; PROTEIN structure
- Publication
PLoS ONE, 2012, Vol 7, Issue 8, p1
- ISSN
1932-6203
- Publication type
Article
- DOI
10.1371/journal.pone.0043847