- Title
Effect of Dataset Size and Medical Image Modality on Convolutional Neural Network Model Performance for Automated Segmentation: A CT and MR Renal Tumor Imaging Study.
- Authors
Gottlich, Harrison C.; Gregory, Adriana V.; Sharma, Vidit; Khanna, Abhinav; Moustafa, Amr U.; Lohse, Christine M.; Potretzke, Theodora A.; Korfiatis, Panagiotis; Potretzke, Aaron M.; Denic, Aleksandar; Rule, Andrew D.; Takahashi, Naoki; Erickson, Bradley J.; Leibovich, Bradley C.; Kline, Timothy L.
- Abstract
The aim of this study is to investigate the use of an exponential-plateau model to determine the training dataset size required to reach maximum medical image segmentation performance. CT and MR images of patients with renal tumors acquired between 1997 and 2017 were retrospectively collected from our nephrectomy registry. Modality-based datasets of 50, 100, 150, 200, 250, and 300 images were assembled to train models with an 80–20 training-validation split, and each model was evaluated against 50 randomly held-out test set images. A third experiment used the KiTS21 dataset to explore the effects of different model architectures. Exponential-plateau models were used to establish the relationship between dataset size and model generalizability performance. For segmenting non-neoplastic kidney regions on CT and MR imaging, our model yielded test Dice score plateaus of 0.93 ± 0.02 and 0.92 ± 0.04, with 54 and 122 training-validation images needed to reach the plateaus, respectively. For segmenting CT and MR tumor regions, we modeled test Dice score plateaus of 0.85 ± 0.20 and 0.76 ± 0.27, with 125 and 389 training-validation images needed to reach the plateaus, respectively. For the KiTS21 dataset, the best Dice score plateaus for the nn-UNet 2D and 3D architectures were 0.67 ± 0.29 and 0.84 ± 0.18, with 177 and 440 images needed to reach the plateaus, respectively. Our research validates that differing imaging modalities, target structures, and model architectures all affect the number of training images required to reach a performance plateau. The modeling approach we developed will help future researchers determine, for their own experiments, when additional training-validation images are unlikely to further improve model performance.
- Subjects
REPORTING of diseases; MAGNETIC resonance imaging; ACQUISITION of data; RETROSPECTIVE studies; KIDNEY tumors; MEDICAL records; COMPUTED tomography; ARTIFICIAL neural networks
- Publication
Journal of Digital Imaging, 2023, Vol 36, Issue 4, p1770
- ISSN
0897-1889
- Publication type
Article
- DOI
10.1007/s10278-023-00804-1
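
The abstract describes fitting an exponential-plateau model to Dice score as a function of dataset size. The record does not give the authors' parameterization, fitting procedure, or criterion for "reaching" the plateau, so the following is only a minimal sketch under stated assumptions: a common form y = y_max - (y_max - y0) * exp(-k * x), a grid search over k with ordinary least squares for the other two parameters, and a hypothetical tolerance `tol` defining how close to the plateau counts as reached. The function names and the synthetic data are illustrative and are not taken from the study.

```python
import math


def exp_plateau(x, y_max, y0, k):
    """Assumed exponential-plateau form: rises from y0 at x=0 toward y_max."""
    return y_max - (y_max - y0) * math.exp(-k * x)


def fit_exp_plateau(sizes, scores, k_grid=None):
    """Fit (y_max, y0, k) by grid search over k.

    For a fixed k the model is linear in e = exp(-k * x), so y_max and the
    slope -(y_max - y0) follow from simple linear regression of scores on e.
    """
    if k_grid is None:
        # Log-spaced rate constants from 1e-4 to 1 (an arbitrary search range).
        k_grid = [10 ** (-4 + 4 * i / 400) for i in range(401)]
    n = len(sizes)
    mean_y = sum(scores) / n
    best = None
    for k in k_grid:
        e = [math.exp(-k * x) for x in sizes]
        mean_e = sum(e) / n
        sxx = sum((ei - mean_e) ** 2 for ei in e)
        if sxx == 0.0:
            continue  # regressor is constant at this k; skip it
        sxy = sum((ei - mean_e) * (yi - mean_y) for ei, yi in zip(e, scores))
        slope = sxy / sxx                 # estimates -(y_max - y0)
        y_max = mean_y - slope * mean_e   # intercept is the plateau
        sse = sum((y_max + slope * ei - yi) ** 2 for ei, yi in zip(e, scores))
        if best is None or sse < best[0]:
            best = (sse, y_max, y_max + slope, k)
    _, y_max, y0, k = best
    return y_max, y0, k


def images_to_plateau(y_max, y0, k, tol=0.01):
    """Dataset size at which the curve is within `tol` of the plateau.

    `tol` is a hypothetical criterion; the study's own definition may differ.
    """
    gap = y_max - y0
    if gap <= tol:
        return 0.0
    return math.log(gap / tol) / k


# Synthetic, noise-free scores generated from known parameters (not study data),
# sampled at the dataset sizes listed in the abstract.
sizes = [50, 100, 150, 200, 250, 300]
scores = [exp_plateau(x, 0.9, 0.5, 0.02) for x in sizes]

y_max, y0, k = fit_exp_plateau(sizes, scores)
print(f"plateau={y_max:.3f}, k={k:.4f}, "
      f"images to plateau={images_to_plateau(y_max, y0, k):.0f}")
```

Because the estimated size depends directly on the chosen tolerance and on how far the fit extrapolates beyond the largest dataset, any number produced this way is best read as a rough guide rather than a precise requirement.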