Hemelings, Ruben; Elen, Bart; Schuster, Alexander K.; Blaschko, Matthew B.; Barbosa-Breda, João; Hujanen, Pekko; Junglas, Annika; Nickels, Stefan; White, Andrew; Pfeiffer, Norbert; Mitchell, Paul; De Boever, Patrick; Tuulonen, Anja; Stalmans, Ingeborg

doi:10.1038/s41746-023-00857-0

Title: A generalizable deep learning regression model for automated glaucoma screening from fundus images.
Authors: Hemelings, Ruben; Elen, Bart; Schuster, Alexander K.; Blaschko, Matthew B.; Barbosa-Breda, João; Hujanen, Pekko; Junglas, Annika; Nickels, Stefan; White, Andrew; Pfeiffer, Norbert; Mitchell, Paul; De Boever, Patrick; Tuulonen, Anja; Stalmans, Ingeborg
Abstract: A plethora of classification models for the detection of glaucoma from fundus images have been proposed in recent years. Often trained with data from a single glaucoma clinic, they report impressive performance on internal test sets, but tend to struggle in generalizing to external sets. This performance drop can be attributed to data shifts in glaucoma prevalence, fundus camera, and the definition of glaucoma ground truth. In this study, we confirm that a previously described regression network for glaucoma referral (G-RISK) obtains excellent results in a variety of challenging settings. Thirteen different data sources of labeled fundus images were utilized. The data sources include two large population cohorts (Australian Blue Mountains Eye Study, BMES and German Gutenberg Health Study, GHS) and 11 publicly available datasets (AIROGS, ORIGA, REFUGE1, LAG, ODIR, REFUGE2, GAMMA, RIM-ONEr3, RIM-ONE DL, ACRIMA, PAPILA). To minimize data shifts in input data, a standardized image processing strategy was developed to obtain 30° disc-centered images from the original data. A total of 149,455 images were included for model testing. Area under the receiver operating characteristic curve (AUC) for BMES and GHS population cohorts were at 0.976 [95% CI: 0.967–0.986] and 0.984 [95% CI: 0.980–0.991] on participant level, respectively. At a fixed specificity of 95%, sensitivities were at 87.3% and 90.3%, respectively, surpassing the minimum criteria of 85% sensitivity recommended by Prevent Blindness America. AUC values on the eleven publicly available data sets ranged from 0.854 to 0.988. These results confirm the excellent generalizability of a glaucoma risk regression model trained with homogeneous data from a single tertiary referral center. Further validation using prospective cohort studies is warranted.
Subjects: GLAUCOMA diagnosis; DEEP learning; RETINA; CONFIDENCE intervals; ARTIFICIAL intelligence; MEDICAL screening; RETROSPECTIVE studies; AUTOMATION; DESCRIPTIVE statistics; PREDICTION models; DATA analysis software; LOGISTIC regression analysis
Publication: NPJ Digital Medicine, 2023, Vol 6, Issue 1, p1
ISSN: 2398-6352
Publication type: Article
DOI: 10.1038/s41746-023-00857-0
