Peng, Le; Luo, Gaoxiang; Walker, Andrew; Zaiman, Zachary; Jones, Emma K; Gupta, Hemant; Kersten, Kristopher; Burns, John L; Harle, Christopher A; Magoc, Tanja; Shickel, Benjamin; Steenburg, Scott D; Loftus, Tyler; Melton, Genevieve B; Gichoya, Judy Wawira; Sun, Ju; Tignanelli, Christopher J

doi:10.1093/jamia/ocac188

Title: Evaluation of Federated Learning Variations for COVID-19 diagnosis using Chest Radiographs from 42 US and European hospitals.
Authors: Peng, Le; Luo, Gaoxiang; Walker, Andrew; Zaiman, Zachary; Jones, Emma K; Gupta, Hemant; Kersten, Kristopher; Burns, John L; Harle, Christopher A; Magoc, Tanja; Shickel, Benjamin; Steenburg, Scott D; Loftus, Tyler; Melton, Genevieve B; Gichoya, Judy Wawira; Sun, Ju; Tignanelli, Christopher J
Abstract: Objective: Federated learning (FL) allows multiple distributed data holders to collaboratively learn a shared model without data sharing. However, individual health system data are heterogeneous. "Personalized" FL variations have been developed to counter data heterogeneity, but few have been evaluated using real-world healthcare data. The purpose of this study is to investigate the performance of a single-site versus a 3-client federated model using a previously described COVID-19 diagnostic model. Additionally, to investigate the effect of system heterogeneity, we evaluate the performance of 4 FL variations. Materials and Methods: We leverage an FL healthcare collaborative including data from 5 international healthcare systems (US and Europe) encompassing 42 hospitals. We implemented a COVID-19 computer vision diagnosis system using the FedAvg algorithm implemented on Clara Train SDK 4.0. To study the effect of data heterogeneity, training data were pooled from 3 systems locally and federation was simulated. We compared a centralized/pooled model versus FedAvg and 3 personalized FL variations (FedProx, FedBN, FedAMP). Results: We observed comparable model performance with respect to internal validation (local model: AUROC 0.94 vs FedAvg: 0.95, p = 0.5) and improved model generalizability with the FedAvg model (p < 0.05). When investigating the effects of model heterogeneity, we observed poor performance with FedAvg on internal validation as compared to personalized FL algorithms. FedAvg did have improved generalizability compared to personalized FL algorithms. On average, FedBN had the best rank performance on internal and external validation. Conclusion: FedAvg can significantly improve the generalization of the model compared to the personalized FL algorithms; however, at the cost of poor internal validity. Personalized FL may offer an opportunity to develop both internally and externally validated algorithms.
Subjects: EUROPE; COVID-19; COVID-19 testing; HOSPITALS; COMPUTER vision
Publication: Journal of the American Medical Informatics Association, 2023, Vol 30, Issue 1, p54
ISSN: 1067-5027
Publication type: Article
DOI: 10.1093/jamia/ocac188
