- Title
TUNING THE PRECISION OF PREDICTORS TO REDUCE OVERESTIMATION OF PROTEIN DISORDER OVER LARGE DATASETS.
- Authors
DEIANA, ANTONIO; GIANSANTI, ANDREA
- Abstract
This is a study on the precision of four known protein disorder predictors, ranked among the best-performing ones: DISOPRED2, PONDR VSL2B, IUPred and ESpritz. We address here the problem of a systematic overestimation of the number of disordered proteins recognized through the use of these predictors, considered as a standard. Some of these predictors, used with their default settings, have a low precision, implying a tendency to overestimate the occurrence of disordered proteins in genome-wide surveys. Moreover, different predictors often disagree on the evaluation of individual proteins. To cope with this problem and in order to propose a simple procedure that enhances precision based on precision-recall curves, we re-tuned the discriminative thresholds of the predictors by training and cross-validating their performance on a curated dataset. After re-tuning, both the disagreement among predictors and the tendency to overestimate the occurrence of disordered proteins are reduced. This is shown in a dedicated study over the human proteome and a set of cancer-related human proteins, with no a priori disorder annotation. Simple quantitative estimates suggest that the occurrence of disorder among cancer-related proteins and other similar large-scale surveys has been overestimated in the past.
- Subjects
PROTEOMICS; PROTEIN structure; DENATURATION of proteins; PROTEIN genetics; TUMOR proteins; QUANTITATIVE research; COMPUTATIONAL biology
- Publication
Journal of Bioinformatics & Computational Biology, 2013, Vol 11, Issue 2, p1
- ISSN
0219-7200
- Publication type
Academic Journal
- DOI
10.1142/S0219720012500230
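
The abstract describes re-tuning each predictor's discriminative threshold using precision-recall curves. The sketch below illustrates that general idea only and is not the authors' procedure. It assumes one disorder score per protein, binary labels from an annotated dataset (1 = disordered), and a hypothetical selection rule: pick the lowest threshold whose precision reaches a chosen target. The function names, the rule, and the toy numbers are all assumptions, and the cross-validation step mentioned in the abstract is omitted.

```python
from typing import List, Optional, Tuple


def precision_recall_at(
    scores: List[float], labels: List[int], threshold: float
) -> Tuple[float, float]:
    """Precision and recall when proteins with score >= threshold are called disordered."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y == 1)
    fp = sum(1 for s, y in pairs if s >= threshold and y == 0)
    fn = sum(1 for s, y in pairs if s < threshold and y == 1)
    # Conventions for empty denominators: no positive calls means no false positives.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def retune_threshold(
    scores: List[float], labels: List[int], target_precision: float
) -> Optional[float]:
    """Return the lowest observed score that, used as threshold, meets the target precision.

    Recall never increases as the threshold rises, so the lowest qualifying
    threshold also gives the highest recall among the qualifying ones.
    Returns None if no threshold reaches the target.
    """
    for t in sorted(set(scores)):
        precision, _ = precision_recall_at(scores, labels, t)
        if precision >= target_precision:
            return t
    return None


if __name__ == "__main__":
    # Toy data, invented for illustration only (not from the paper).
    scores = [0.1, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
    labels = [0, 0, 1, 0, 0, 1, 1, 1]
    print("default 0.5:", precision_recall_at(scores, labels, 0.5))
    tuned = retune_threshold(scores, labels, target_precision=0.9)
    print("re-tuned threshold:", tuned)
    if tuned is not None:
        print("at re-tuned threshold:", precision_recall_at(scores, labels, tuned))
```

On this toy data, raising the threshold removes the false positives that the default cutoff lets through, which is the mechanism by which a re-tuned threshold would lower the estimated fraction of disordered proteins in a large survey.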