Title: A new statistic for efficient detection of repetitive sequences.
Authors: Chen, Sijie; Chen, Yixin; Sun, Fengzhu; Waterman, Michael S; Zhang, Xuegong
Abstract: Motivation: Detecting sequences containing repetitive regions is a basic bioinformatics task with many applications. Several methods have been developed for various types of repeat detection tasks, but an efficient generic method for detecting most types of repetitive sequences is still desirable. Inspired by the excellent properties and successful applications of the D2 family of statistics in comparative analyses of genomic sequences, we developed a new statistic, D2R, that can efficiently discriminate between sequences with and without repetitive regions. Results: Using the statistic, we developed an algorithm of linear time and space complexity for detecting most types of repetitive sequences in multiple scenarios, including finding candidate clustered regularly interspaced short palindromic repeats (CRISPR) regions from bacterial genomic or metagenomic sequences. Simulation and real data experiments show that the method works well on both assembled sequences and unassembled short reads. Availability and implementation: The code is available at https://github.com/XuegongLab/D2R_codes under the GPL 3.0 license. Supplementary information: Supplementary data are available at Bioinformatics online.
Subjects: VECTOR spaces; SPACETIME; SEQUENCE analysis; COMPARATIVE studies
Publication: Bioinformatics, 2019, Vol 35, Issue 22, p4596
ISSN: 1367-4803
Publication type: Article
DOI: 10.1093/bioinformatics/btz262
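
Note: The abstract refers to the D2 family of statistics without defining it. As a rough illustration only, the classic D2 statistic between two sequences is the sum, over all words of length k, of the product of the word's counts in each sequence. The Python sketch below computes that quantity and, as a separate and purely hypothetical proxy for repetitiveness, counts pairs of positions within one sequence that share the same k-mer. It is not the D2R statistic or the authors' algorithm, whose definition is not given in this record; the function names `kmer_counts`, `d2`, and `repeated_kmer_pairs` are assumptions made for this example.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer (word of length k) in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a, seq_b, k):
    """Classic D2: sum over words w of count_a(w) * count_b(w)."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # Counter returns 0 for words absent from seq_b, so only shared words add.
    return sum(n * counts_b[w] for w, n in counts_a.items())


def repeated_kmer_pairs(seq, k):
    """Hypothetical repetitiveness proxy (NOT the paper's D2R statistic):
    number of position pairs in seq that carry an identical k-mer."""
    return sum(n * (n - 1) // 2 for n in kmer_counts(seq, k).values())


if __name__ == "__main__":
    print(d2("ACGT", "ACGA", 2))
    print(repeated_kmer_pairs("ACGACGACG", 3))
    print(repeated_kmer_pairs("ACGTTGCA", 3))
```

For a fixed k, both functions make a single pass over each sequence and store one counter per distinct word, which is the kind of linear time and space behaviour the abstract describes, though the published method may differ in every detail.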
