Indexing Genomic Databases

Gina Cooper, Michael L. Raymer, Travis E. Doom, Dan E. Krane, Natsuhiko Futamura

Research output: Contribution to conference › Poster

Abstract

Current biological sequence comparison tools search the full database to find approximate matches between a database and a query. This work takes a different approach: the database is first indexed using a novel indexing scheme, so that highly mismatched sequences can be eliminated immediately, improving performance and accuracy. iBlast is proposed as an indexed version of BLAST. In its initial implementation, iBlast uses a sequence-based index to catalog genomic databases in an NCR Teradata RDBMS. Several types of indexes and querying methods are explored to determine the most efficient solution that exploits the parallel nature of the Teradata system. Significant speedups were obtained and are explained in further detail in this paper. Future indexing methods based on prokaryotic and eukaryotic genome structures are also proposed.
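
The abstract does not specify how the iBlast index is built, so the following is only an illustrative sketch of the general idea of index-based filtering, not the authors' scheme or their Teradata implementation. It assumes a simple in-memory k-mer (fixed-length substring) index; the function names, the value of k, and the toy sequences are all hypothetical. Sequences that share no k-mer with the query are discarded before any detailed comparison would be run.

```python
from collections import defaultdict


def build_kmer_index(sequences, k=4):
    """Map each k-mer to the set of ids of sequences that contain it.

    Hypothetical helper for illustration; not the iBlast index.
    """
    index = defaultdict(set)
    for seq_id, seq in sequences.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(seq_id)
    return index


def candidate_sequences(index, query, k=4, min_shared=1):
    """Return ids of sequences sharing at least min_shared distinct query k-mers."""
    hits = defaultdict(int)
    query_kmers = {query[i:i + k] for i in range(len(query) - k + 1)}
    for kmer in query_kmers:
        # .get avoids creating empty entries in the defaultdict index
        for seq_id in index.get(kmer, ()):
            hits[seq_id] += 1
    return {seq_id for seq_id, n in hits.items() if n >= min_shared}


# Toy database (made-up sequences, for illustration only)
database = {
    "seq1": "ACGTACGTGA",
    "seq2": "TTTTTTTTTT",
    "seq3": "GGACGTAC",
}
index = build_kmer_index(database, k=4)
candidates = candidate_sequences(index, "ACGTAC", k=4)
print(sorted(candidates))  # ['seq1', 'seq3']
```

Here seq2 is eliminated by the index lookup alone, so only seq1 and seq3 would be passed on to a full alignment step. In a relational setting the same mapping could plausibly be stored as a table of (k-mer, sequence id) rows, though the abstract does not say whether iBlast does this.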

Original language: American English
State: Published - May 1 2004
Event: Proceedings of the Fourth IEEE Symposium on Bioinformatics and Bioengineering
Duration: May 1 2004 → …

Conference

Conference: Proceedings of the Fourth IEEE Symposium on Bioinformatics and Bioengineering
Period: 5/1/04 → …

Disciplines

  • Bioinformatics
  • Communication
  • Communication Technology and New Media
  • Computer Sciences
  • Databases and Information Systems
  • Life Sciences
  • OS and Networks
  • Physical Sciences and Mathematics
  • Science and Technology Studies
  • Social and Behavioral Sciences
