We found a match
Your institution may have access to this item. Find your institution then sign in to continue.
- Title
Personalized Driver Gene Prediction Using Graph Convolutional Networks with Conditional Random Fields.
- Authors
Wei, Pi-Jing; Zhu, An-Dong; Cao, Ruifen; Zheng, Chunhou
- Abstract
Simple Summary: Identifying cancer driver genes plays a significant role in cancer diagnosis and treatment. With the advancement of next-generation sequencing technologies, a wealth of multi-omics cancer data, including genomic, epigenomic, and transcriptomic data, are now available for cancer research. Integrating these data to effectively identify cancer driver genes causally associated with cancer is a computational challenge. Methods for identifying cancer driver genes are mainly based on population levels. Considering the trend of precision medicine and the heterogeneity of patients, it is challenging but crucial to identify cancer driver genes at the individual level. We developed a method called PDGCN (Personalized Drivers of GCN), which constructs sample–gene interaction networks by integrating multiple types of data features and using network structural features extracted from Node2vec. Then, a graphical convolutional neural network model with a conditional random field layer is used to prioritize candidate driver genes in the network. The results show that PDGCN can identify driver genes at the individual level, providing a new perspective for predicting driver genes in individual samples. Cancer is a complex and evolutionary disease mainly driven by the accumulation of genetic variations in genes. Identifying cancer driver genes is important. However, most related studies have focused on the population level. Cancer is a disease with high heterogeneity. Thus, the discovery of driver genes at the individual level is becoming more valuable but is a great challenge. Although there have been some computational methods proposed to tackle this challenge, few can cover all patient samples well, and there is still room for performance improvement. In this study, to identify individual-level driver genes more efficiently, we propose the PDGCN method. PDGCN integrates multiple types of data features, including mutation, expression, methylation, copy number data, and system-level gene features, along with network structural features extracted using Node2vec in order to construct a sample–gene interaction network. Prediction is performed using a graphical convolutional neural network model with a conditional random field layer, which is able to better combine the network structural features with biological attribute features. Experiments on the ACC (Adrenocortical Cancer) and KICH (Kidney Chromophobe) datasets from TCGA (The Cancer Genome Atlas) demonstrated that the method performs better compared to other similar methods. It can identify not only frequently mutated driver genes, but also rare candidate driver genes and novel biomarker genes. The results of the survival and enrichment analyses of these detected genes demonstrate that the method can identify important driver genes at the individual level.
- Subjects
RANDOM fields; MARKOV random fields; CANCER genes; GENE expression; GENETIC variation; GENES; GENE regulatory networks
- Publication
Biology (2079-7737), 2024, Vol 13, Issue 3, p184
- ISSN
2079-7737
- Publication type
Article
- DOI
10.3390/biology13030184