- Title
CEMIG: prediction of the cis-regulatory motif using the de Bruijn graph from ATAC-seq.
- Authors
Wang, Yizhong; Li, Yang; Wang, Cankun; Lio, Chan-Wang Jerry; Ma, Qin; Liu, Bingqiang
- Abstract
Sequence motif discovery algorithms enhance the identification of novel deoxyribonucleic acid sequences with pivotal biological significance, especially transcription factor (TF)-binding motifs. The advent of assay for transposase-accessible chromatin using sequencing (ATAC-seq) has broadened the toolkit for motif characterization. Nonetheless, prevailing computational approaches have focused on delineating TF-binding footprints, with motif discovery receiving less attention. Herein, we present Cis rEgulatory Motif Influence using de Bruijn Graph (CEMIG), an algorithm leveraging de Bruijn and Hamming distance graph paradigms to predict and map motif sites. Assessment on 129 ATAC-seq datasets from the Cistrome Data Browser demonstrates CEMIG's exceptional performance, surpassing three established methodologies on four evaluative metrics. CEMIG accurately identifies both cell-type-specific and common TF motifs within GM12878 and K562 cell lines, demonstrating its comparative genomic capabilities in the identification of evolutionary conservation and cell-type specificity. In-depth transcriptional and functional genomic studies have validated the functional relevance of CEMIG-identified motifs across various cell types. CEMIG is available at https://github.com/OSU-BMBL/CEMIG, developed in C++ to ensure cross-platform compatibility with Linux, macOS and Windows operating systems.
- Subjects
De Bruijn graph; DNA; Hamming distance; Transcription factors; Graph theory; Chromatin; Linux operating systems
- Publication
Briefings in Bioinformatics, 2024, Vol 25, Issue 1, p1
- ISSN
1467-5463
- Publication type
Article
- DOI
10.1093/bib/bbad505