- Title
Grid-Assembly: An oligonucleotide composition-based partitioning strategy to aid metagenomic sequence assembly.
- Authors
Ghosh, Tarini Shankar; Mehra, Varun; Mande, Sharmila S.
- Abstract
Metagenomics involves extracting, sequencing, and characterizing the genomic content of the entire community of microbes present in a given environment. In contrast to single-genome data, metagenomic sequences are difficult to assemble accurately. Given the huge volume and diverse taxonomic origin of metagenomic sequences, directly applying single-genome assembly methods to metagenomes is likely not only to greatly increase computational infrastructure requirements but also to produce chimeric contigs. One strategy for addressing this challenge is to partition metagenomic sequence datasets into clusters and assemble the sequences in each cluster separately using any single-genome assembly method. The current study presents such an approach, which uses tetranucleotide usage patterns to first represent sequences as points in a three-dimensional (3D) space. The 3D space is subsequently partitioned into "Grids". Sequences within overlapping grids are then progressively assembled using any available assembler. We demonstrate the applicability of the Grid-Assembly method using various categories of assemblers as well as different simulated metagenomic datasets. Validation results indicate that the Grid-Assembly approach improves the overall quality of assembly in terms of the purity and volume of the assembled contigs.
- Subjects
GRID computing; OLIGONUCLEOTIDES; METAGENOMICS; CHIMERIC proteins; COMPUTER simulation; NUCLEOTIDE sequence
- Publication
Journal of Bioinformatics & Computational Biology, 2015, Vol 13, Issue 3, p-1
- ISSN
0219-7200
- Publication type
Article
- DOI
10.1142/S0219720015410048
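
- Illustrative sketch (not part of the original record)
The partitioning idea summarized in the abstract (tetranucleotide usage, then points in 3D, then grid cells) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how tetranucleotide profiles are mapped to 3D, so a seeded random projection stands in for that step, and the cells here do not overlap, whereas the paper assembles sequences within overlapping grids. All function names and the `cells_per_axis` parameter are hypothetical.

```python
import itertools
import random
from collections import defaultdict

# All 256 tetranucleotides in a fixed order, with a lookup table.
TETRAMERS = ["".join(p) for p in itertools.product("ACGT", repeat=4)]
INDEX = {t: i for i, t in enumerate(TETRAMERS)}


def tetra_freq(seq):
    """Return the normalized tetranucleotide frequency vector of a sequence."""
    counts = [0] * 256
    seq = seq.upper()
    for i in range(len(seq) - 3):
        j = INDEX.get(seq[i:i + 4])  # windows with non-ACGT bases are skipped
        if j is not None:
            counts[j] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * 256


def make_projection(seed=0):
    """ASSUMPTION: a fixed random 256 -> 3 projection, used only as a
    placeholder for the paper's (unspecified here) mapping to 3D."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(256)] for _ in range(3)]


def to_point(freq, proj):
    """Project a 256-dimensional frequency vector to a 3D point."""
    return tuple(sum(f * w for f, w in zip(freq, axis)) for axis in proj)


def partition(points, cells_per_axis=4):
    """Bin 3D points into a regular grid over their bounding box.

    Returns {cell_index_triple: [sequence indices]}. Cells are
    non-overlapping in this sketch.
    """
    lows = [min(p[d] for p in points) for d in range(3)]
    highs = [max(p[d] for p in points) for d in range(3)]
    grids = defaultdict(list)
    for idx, p in enumerate(points):
        cell = []
        for d in range(3):
            span = highs[d] - lows[d]
            k = int((p[d] - lows[d]) / span * cells_per_axis) if span > 0 else 0
            cell.append(min(k, cells_per_axis - 1))  # clamp the upper edge
        grids[tuple(cell)].append(idx)
    return dict(grids)
```

In such a sketch, the sequences collected in each cell would then be handed to a single-genome assembler of choice; that step is omitted here because it depends on the external tool.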