Chinn, Erin; Arora, Rohit; Arnaout, Ramy; Arnaout, Rima

doi:10.1093/jamia/ocad055

Title: ENRICHing medical imaging training sets enables more efficient machine learning.
Authors: Chinn, Erin; Arora, Rohit; Arnaout, Ramy; Arnaout, Rima
Abstract: Objective: Deep learning (DL) has been applied in proofs of concept across biomedical imaging, including across modalities and medical specialties. Labeled data are critical to training and testing DL models, but human expert labelers are limited. In addition, DL traditionally requires copious training data, which is computationally expensive to process and iterate over. Consequently, it is useful to prioritize using those images that are most likely to improve a model's performance, a practice known as instance selection. The challenge is determining how best to prioritize. It is natural to prefer straightforward, robust, quantitative metrics as the basis for prioritization for instance selection. However, in current practice, such metrics are not tailored to, and almost never used for, image datasets. Materials and Methods: To address this problem, we introduce ENRICH (Eliminate Noise and Redundancy for Imaging Challenges), a customizable method that prioritizes images based on how much diversity each image adds to the training set. Results: First, we show that medical datasets are special in that in general each image adds less diversity than in nonmedical datasets. Next, we demonstrate that ENRICH achieves nearly maximal performance on classification and segmentation tasks on several medical image datasets using only a fraction of the available images and without up-front data labeling. ENRICH outperforms random image selection, the negative control. Finally, we show that ENRICH can also be used to identify errors and outliers in imaging datasets. Conclusions: ENRICH is a simple, computationally efficient method for prioritizing images for expert labeling and use in DL.
Subjects: COMPUTER-assisted image analysis (Medicine); DIAGNOSTIC imaging; IMAGE segmentation; PROOF of concept; DEEP learning; MEDICAL specialties & specialists; MACHINE learning
Publication: Journal of the American Medical Informatics Association, 2023, Vol 30, Issue 6, p1079
ISSN: 1067-5027
Publication type: Article
DOI: 10.1093/jamia/ocad055
