- Title
Modeling the sequence dependence of differential antibody binding in the immune response to infectious disease.
- Authors
Chowdhury, Robayet; Taguchi, Alexander T.; Kelbauskas, Laimonas; Stafford, Phillip; Diehnelt, Chris; Zhao, Zhan-Gong; Williamson, Phillip C.; Green, Valerie; Woodbury, Neal W.
- Abstract
Past studies have shown that incubation of human serum samples on high-density peptide arrays followed by measurement of total antibody bound to each peptide sequence allows detection and discrimination of humoral immune responses to a variety of infectious diseases. This is true even though these arrays consist of peptides with near-random amino acid sequences that were not designed to mimic biological antigens. This "immunosignature" approach is based on a statistical evaluation of the binding pattern for each sample, but it ignores the information contained in the amino acid sequences that the antibodies are binding to. Here, similar array-based antibody profiles are instead used to train a neural network to model the sequence dependence of molecular recognition involved in the immune response of each sample. The binding profiles used resulted from incubating serum from 5 infectious disease cohorts (Hepatitis B and C, Dengue Fever, West Nile Virus and Chagas disease) and an uninfected cohort with 122,926 peptide sequences on an array. These sequences were selected quasi-randomly to represent an even but sparse sample of the entire possible combinatorial sequence space (~10^12). This very sparse sampling of combinatorial sequence space was sufficient to capture a statistically accurate representation of the humoral immune response across the entire space. Processing array data using the neural network not only captures the disease-specific sequence-binding information but also aggregates binding information with respect to sequence, removing sequence-independent noise and improving the accuracy of array-based classification of disease compared with the raw binding data.
Because the neural network model is trained on all samples simultaneously, a highly condensed representation of the differential information between samples resides in the output layer of the model, and the column vectors from this layer can be used to represent each sample for classification or unsupervised clustering applications.
Author summary: Previous studies have shown that it is possible to use high-density arrays of near-random peptide sequences as a general, disease-agnostic approach to diagnosis by analyzing the pattern of antibody binding in serum to the array. The current approach replaces purely statistical pattern recognition with a machine learning-based approach that comprehensively describes the binding of antibodies to specific amino acid sequences. As one use case, this can be implemented to substantially enhance the diagnostic power of these peptide array-based antibody profiles by incorporating the sequence information with the measured antibody binding to better detect and discriminate infectious diseases. This makes the array analysis much more robust to noise and provides a means of condensing the disease-differentiating information from the array into a compact form that can be readily used for disease classification, unsupervised clustering or population health monitoring.
- Subjects
WEST Nile fever; IMMUNE response; COMMUNICABLE diseases; PATTERN recognition systems; AMINO acid sequence; MACHINE learning
- Publication
PLoS Computational Biology, 2023, Vol 19, Issue 6, p1
- ISSN
1553-734X
- Publication type
Article
- DOI
10.1371/journal.pcbi.1010773
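
The abstract notes that the model is trained on all samples at once, so that each sample corresponds to one column of the output layer. The following is a minimal, untrained sketch of that general idea, not the authors' implementation. The alphabet, maximum peptide length, layer sizes, ReLU activation, and all function names are assumptions chosen for illustration; the record does not specify them.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumption: standard 20-letter alphabet
MAX_LEN = 12                           # assumption: hypothetical maximum peptide length


def one_hot(peptide):
    """Flatten a peptide into a one-hot vector, zero-padded to MAX_LEN positions."""
    vec = [0.0] * (MAX_LEN * len(AMINO_ACIDS))
    for pos, aa in enumerate(peptide[:MAX_LEN]):
        vec[pos * len(AMINO_ACIDS) + AMINO_ACIDS.index(aa)] = 1.0
    return vec


def init_matrix(rows, cols, rng):
    """Small random weights; a real model would learn these from array data."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def vec_mat(x, weights):
    """Multiply a row vector x (length rows) by a rows-by-cols matrix."""
    n_cols = len(weights[0])
    return [sum(x[i] * weights[i][j] for i in range(len(x))) for j in range(n_cols)]


def relu(values):
    return [max(0.0, v) for v in values]


def predict(peptide, w_hidden, w_out):
    """Shared hidden layer, then one predicted binding value per serum sample."""
    hidden = relu(vec_mat(one_hot(peptide), w_hidden))
    return vec_mat(hidden, w_out)


def sample_vector(w_out, sample_index):
    """Column of the output layer: a compact representation of one sample."""
    return [row[sample_index] for row in w_out]


rng = random.Random(0)
N_HIDDEN, N_SAMPLES = 8, 3  # hypothetical sizes for illustration
w_hidden = init_matrix(MAX_LEN * len(AMINO_ACIDS), N_HIDDEN, rng)
w_out = init_matrix(N_HIDDEN, N_SAMPLES, rng)

binding = predict("ACDKW", w_hidden, w_out)  # one value per sample
sample_repr = sample_vector(w_out, 1)        # vector describing sample 1
```

In this sketch, the hidden layer is shared by every sample, so only the output columns differ between samples. After training, those columns could be passed to any standard classifier or clustering routine in place of the raw per-peptide binding values.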