Improving Remote Homology Detection Using Sequence Properties and Position Specific Scoring Matrices

Gina Cooper, Michael L. Raymer

Research output: Contribution to conference (Presentation)

Abstract

Current biological sequence comparison tools frequently fail to recognize matches between homologs when sequence similarity falls within the twilight zone of less than 25% sequence identity. Combining sequence properties with position-specific scoring matrices improves accuracy in remote homology detection. This paper extends the work of Propsearch, a sequence-property-based approach to sequence searching, by incorporating a population adaptive genetic algorithm that uses position-specific scoring matrices in feature calculation. The genetic algorithm is trained to obtain optimized feature weights, which are then used to find homologs of a query sequence. The remote homology detector is tested on databases with less than 10%, 20%, and 30% sequence similarity. The optimized detector is compared with other sequence similarity programs in both accuracy and time complexity. Future considerations for position-specific scoring matrices based on the original genetic algorithm are also proposed.
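
As a rough illustration of searching by weighted sequence properties, the sketch below ranks database sequences by a weighted distance between feature vectors. It is not the authors' implementation: the feature set (plain amino acid composition), the function names, and the uniform weights are all assumptions made for the example. In the approach described above, the weights would come from the trained genetic algorithm, and features derived from position-specific scoring matrices would be included as well.

```python
from collections import Counter
import math

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_features(seq):
    """Return the fraction of each standard amino acid in seq.

    Hypothetical stand-in for a richer sequence-property feature set.
    """
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def weighted_distance(x, y, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))


def rank_candidates(query, database, weights):
    """Return database sequence names ordered from nearest to farthest."""
    q = composition_features(query)
    scored = [
        (weighted_distance(q, composition_features(seq), weights), name)
        for name, seq in database.items()
    ]
    return [name for _, name in sorted(scored)]


# Example: uniform weights are an assumption; optimized weights would replace them.
weights = [1.0] * len(AMINO_ACIDS)
database = {"same": "ACDE", "diff": "WWWW"}
ranking = rank_candidates("ACDE", database, weights)
```

Because the comparison works on fixed-length feature vectors rather than alignments, each database sequence costs a constant amount of work per query once its features are computed.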

Original language: American English
State: Published - Jul 1 2009
Event: Proceedings of the International Conference on Bioinformatics & Computational Biology
Duration: Jul 1 2009 → …

Conference

Conference: Proceedings of the International Conference on Bioinformatics & Computational Biology
Period: 7/1/09 → …

Keywords

  • Protein Homology
  • Remote Homologs
  • Sequence Search

Disciplines

  • Bioinformatics
  • Communication
  • Communication Technology and New Media
  • Computer Sciences
  • Databases and Information Systems
  • Life Sciences
  • OS and Networks
  • Physical Sciences and Mathematics
  • Science and Technology Studies
  • Social and Behavioral Sciences
