- Title
Capturing cell type-specific chromatin compartment patterns by applying topic modeling to single-cell Hi-C data.
- Authors
Kim, Hyeon-Jin; Yardımcı, Galip Gurkan; Bonora, Giancarlo; Ramani, Vijay; Liu, Jie; Qiu, Ruolan; Lee, Choli; Hesson, Jennifer; Ware, Carol B.; Shendure, Jay; Duan, Zhijun; Noble, William Stafford
- Abstract
Single-cell Hi-C (scHi-C) interrogates genome-wide chromatin interactions in individual cells, allowing us to gain insights into 3D genome organization. However, the extremely sparse nature of scHi-C data poses a significant barrier to analysis, limiting our ability to tease out hidden biological information. In this work, we approach this problem by applying topic modeling to scHi-C data. Topic modeling is well suited to discovering latent topics in a collection of discrete data. For our analysis, we generate nine different single-cell combinatorial indexed Hi-C (sci-Hi-C) libraries from five human cell lines (GM12878, H1Esc, HFF, IMR90, and HAP1), consisting of over 19,000 cells. We demonstrate that topic modeling successfully captures cell type differences from sci-Hi-C data in the form of "chromatin topics." We further show enrichment of particular compartment structures associated with locus pairs in these topics.
Author summary: The genomes of higher organisms are intricately folded and organized in a dynamic manner that has strong implications for many biological processes. Each chromosome undergoes dramatic changes to its three-dimensional conformation during the cell cycle, whereas the positioning of chromosomes within the nucleus plays an important role in controlling the activation of specific genes. Recently, it has become possible to investigate the 3D conformations of the genomes of individual cells using a high-throughput sequencing assay called single-cell Hi-C (scHi-C). However, data from these assays are sparse and noisy, making analysis and interpretation of scHi-C data challenging. In this work, we generated a scHi-C dataset of over 19,000 cells from five human cell lines and applied a natural language processing method called topic modeling to discover cell type-specific "chromatin topics." We show that these topics can be used to distinguish between cells at different stages of the cell cycle and cells from different tissues based on the 3D conformation of their genomes, despite the sparsity of the data. We further show that the 3D conformations of single cells are linked to the expression of cell type-specific genes and to cell cycle-associated conformational patterns.
- Subjects
NATURAL language processing; CELL cycle; CELLS
- Publication
PLoS Computational Biology, 2020, Vol 16, Issue 9, p1
- ISSN
1553-734X
- Publication type
Article
- DOI
10.1371/journal.pcbi.1008173