- Title
Reducing storage requirements for biological sequence comparison.
- Authors
Roberts, Michael; Hayes, Wayne; Hunt, Brian R; Mount, Stephen M; Yorke, James A
- Abstract
Comparison of nucleic acid and protein sequences is a fundamental tool of modern bioinformatics. A dominant method of such string matching is the 'seed-and-extend' approach, in which occurrences of short subsequences called 'seeds' are used to search for potentially longer matches in a large database of sequences. Each such potential match is then checked to see if it extends beyond the seed. To be effective, the seed-and-extend approach needs to catalogue seeds from virtually every substring in the database of search strings. Projects such as mammalian genome assemblies and large-scale protein matching, however, have such large sequence databases that the resulting list of seeds cannot be stored in RAM on a single computer. This significantly slows the matching process.
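As a rough illustration of the seed-and-extend idea described in the abstract, the sketch below catalogues every length-k substring of a small database as a seed, looks up each seed of a query, and extends each hit while the characters keep matching. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`build_seed_index`, `extend`, `seed_and_extend`), the exact-match extension rule, and the toy sequences are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def build_seed_index(database, k):
    """Map every length-k substring (seed) to the places where it occurs.

    One entry is stored per substring position, which is why the
    catalogue grows roughly in proportion to the database size.
    """
    index = defaultdict(list)
    for seq_id, seq in enumerate(database):
        for pos in range(len(seq) - k + 1):
            index[seq[pos:pos + k]].append((seq_id, pos))
    return index


def extend(query, q_pos, target, t_pos, k):
    """Grow an exact seed match left and right while characters agree.

    Assumption: ungapped, exact extension; real tools typically allow
    mismatches or gaps and score the extension.
    """
    left = 0
    while (q_pos - left > 0 and t_pos - left > 0
           and query[q_pos - left - 1] == target[t_pos - left - 1]):
        left += 1
    right = k
    while (q_pos + right < len(query) and t_pos + right < len(target)
           and query[q_pos + right] == target[t_pos + right]):
        right += 1
    return q_pos - left, t_pos - left, left + right


def seed_and_extend(query, database, index, k):
    """Return the set of (seq_id, query_start, target_start, length) matches."""
    matches = set()
    for q_pos in range(len(query) - k + 1):
        seed = query[q_pos:q_pos + k]
        for seq_id, t_pos in index.get(seed, ()):
            q_start, t_start, length = extend(
                query, q_pos, database[seq_id], t_pos, k)
            matches.add((seq_id, q_start, t_start, length))
    return matches


# Toy data, invented for this example only.
database = ["ACGTACGTGA", "TTGACGTGAC"]
k = 4
index = build_seed_index(database, k)
matches = seed_and_extend("GACGTGA", database, index, k)
total_entries = sum(len(hits) for hits in index.values())
```

Here `total_entries` counts one stored position per length-k substring of the database, which is the storage cost the abstract refers to when it says the list of seeds can outgrow the RAM of a single computer.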
- Publication
Bioinformatics (Oxford, England), 2004, Vol 20, Issue 18, p3363
- ISSN
1367-4803
- Publication type
Journal Article
- DOI
10.1093/bioinformatics/bth408