Abstract
Current biological sequence comparison tools frequently fail to recognize homologs when sequence identity falls into the twilight zone, that is, below 25%. Combining sequence properties with position-specific scoring matrices improves the accuracy of remote homology detection. This paper extends Propsearch, a sequence-property-based approach to sequence searching, by incorporating a population-adaptive genetic algorithm that uses position-specific scoring matrices in feature calculation. Feature weights are optimized by training the genetic algorithm and are then used to find homologs of a query sequence. The remote homology detector is tested on databases with less than 10%, 20%, and 30% sequence similarity. The optimized detector is compared with other sequence similarity programs in terms of both accuracy and time complexity. Future considerations for position-specific scoring matrices based on the original genetic algorithm are also proposed.
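The idea of searching by weighted sequence properties can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes amino acid composition as the only feature, a weighted Euclidean distance as the score, and a hand-supplied weight vector standing in for the weights a trained genetic algorithm would produce. The names `composition`, `weighted_distance`, and `rank_candidates` are hypothetical.

```python
from collections import Counter
from math import sqrt

# The 20 standard amino acids, in a fixed order so feature vectors align.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Return the fraction of each standard amino acid in seq (assumed feature set)."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]


def weighted_distance(x, y, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))


def rank_candidates(query, database, weights):
    """Sort database entries by increasing distance to the query.

    database maps a name to a sequence; weights is a placeholder for
    values that an optimizer such as a genetic algorithm would supply.
    """
    q = composition(query)
    scored = [
        (weighted_distance(q, composition(seq), weights), name)
        for name, seq in database.items()
    ]
    return sorted(scored)


if __name__ == "__main__":
    db = {"candidate_a": "ACDEFGHIKL", "candidate_b": "WWWWYYYYFF"}
    uniform = [1.0] * len(AMINO_ACIDS)  # untrained, equal weights
    for dist, name in rank_candidates("ACDEFGHIKM", db, uniform):
        print(f"{name}\t{dist:.3f}")
```

Because the score depends only on global properties rather than on an alignment, a candidate can rank highly even when its residue-by-residue identity with the query is low, which is the motivation for property-based search in the twilight zone.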
Original language | American English |
---|---|
State | Published - Jul 1 2009 |
Event | Proceedings of the International Conference on Bioinformatics & Computational Biology - Duration: Jul 1 2009 → … |
Keywords
- Protein Homology
- Remote Homologs
- Sequence Search
Disciplines
- Bioinformatics
- Communication
- Communication Technology and New Media
- Computer Sciences
- Databases and Information Systems
- Life Sciences
- OS and Networks
- Physical Sciences and Mathematics
- Science and Technology Studies
- Social and Behavioral Sciences