- Title
Calibrating machine learning approaches for probability estimation: A short expansion.
- Authors
Ojeda, Francisco M.; Baker, Stuart G.; Ziegler, Andreas
- Abstract
This article examines the importance of calibration when prediction models are applied to new populations. It compares several calibration approaches and discusses Elkan's general updating approach (GUA), which accounts for differences in base rates between populations without re-estimating the model. The calibrated probability can be expressed as a function of the base rates, which are estimated as the proportions of subjects with the outcome in the training and calibration data. Two approaches for estimating the base probability in the training data set are discussed: Elkan's GUA uses the average of the target variable, while Baker's GUA uses the average of the conditional probability. Both approaches assume that the only difference between the populations is the change in base rate, and both require a calibration data set. A simulation study comparing calibration methods for machine learning models found that beta calibration performed best, while isotonic calibration and IVAP calibration performed similarly; Elkan calibration and Baker calibration also showed similar performance. The article provides detailed information on the simulation scenarios and the performance of each calibration method.
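The base-rate adjustment described in the abstract can be sketched as follows. This is a minimal illustration, assuming the standard prior-shift correction (class-conditional distributions unchanged between populations); the function name and interface are hypothetical, not taken from the article.

```python
def adjust_for_base_rate(p: float, b_train: float, b_new: float) -> float:
    """Adjust a calibrated probability p, estimated under training base
    rate b_train, to a population with base rate b_new, assuming only
    the base rate differs between the two populations (prior shift)."""
    # Reweight the positive and negative components by the ratio of priors,
    # then renormalize so the result is again a probability in [0, 1].
    pos = p * b_new / b_train
    neg = (1.0 - p) * (1.0 - b_new) / (1.0 - b_train)
    return pos / (pos + neg)


# If the base rates agree, the probability is left (numerically) unchanged;
# a higher base rate in the new population shifts the probability upward.
print(adjust_for_base_rate(0.3, b_train=0.2, b_new=0.2))
print(adjust_for_base_rate(0.3, b_train=0.2, b_new=0.5))
```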
- Subjects
RECEIVER operating characteristic curves; INDEPENDENT variables; BISECTORS (Geometry); SUPPORT vector machines; MACHINE learning
- Publication
Statistics in Medicine, 2024, Vol. 43, Issue 21, p. 4212
- ISSN
0277-6715
- Publication type
Article
- DOI
10.1002/sim.10051