Genetic Programming for Improved Data Mining: Application to the Biochemistry of Protein Interactions

Michael L. Raymer, William F. Punch, Erik D. Goodman, Leslie A. Kuhn

Research output: Contribution to conference › Presentation

Abstract

We have previously shown how a genetic algorithm (GA) can be used to perform "data mining," the discovery of important data within large datasets, by finding optimal data classifications using known examples. However, these approaches, while successful, limited data relationships to those that were "fixed" before the GA run. We report here on an extension of our previous work, substituting a genetic program (GP) for the GA. Like the GA, the GP could optimize data classification, but it could also determine the functional relationships among the features. This gave improved performance and new information on important relationships among features. We discuss the overall approach and compare the effectiveness of the GA vs. GP on a biochemistry problem: determining the involvement of bound water molecules in protein interactions.
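The contrast the abstract draws between a GA with "fixed" relationships and a GP that also evolves the functional form can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the presentation: the per-feature weighting used for the GA-style individual, the nested-tuple tree encoding, the function set (`add`, `sub`, `mul`), the threshold classifier, and the names `ga_transform`, `gp_evaluate`, `random_tree`, and `accuracy` are all hypothetical. No evolutionary loop is shown, only how the two kinds of individual differ.

```python
import operator
import random

# Hypothetical function set available to the GP. A GA-style individual would
# instead fix the form of the relationship and evolve only its numbers.
FUNCTIONS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def ga_transform(weights, features):
    """GA-style individual: the relationship is fixed (one weight per
    feature); only the weight values are subject to search."""
    return [w * x for w, x in zip(weights, features)]


def gp_evaluate(tree, features):
    """GP-style individual: a nested-tuple expression over the features.

    A tree is ("x", i) for feature i, ("const", c) for a constant, or
    (function_name, left_subtree, right_subtree).
    """
    tag = tree[0]
    if tag == "x":
        return features[tree[1]]
    if tag == "const":
        return tree[1]
    left = gp_evaluate(tree[1], features)
    right = gp_evaluate(tree[2], features)
    return FUNCTIONS[tag](left, right)


def random_tree(n_features, depth, rng):
    """Grow a random expression tree no deeper than `depth`."""
    if depth == 0 or rng.random() < 0.3:
        if rng.random() < 0.7:
            return ("x", rng.randrange(n_features))
        return ("const", round(rng.uniform(-1.0, 1.0), 2))
    name = rng.choice(sorted(FUNCTIONS))
    return (name,
            random_tree(n_features, depth - 1, rng),
            random_tree(n_features, depth - 1, rng))


def accuracy(tree, samples, threshold=0.0):
    """Illustrative fitness: fraction of (features, label) pairs classified
    correctly when the tree output is compared against a threshold."""
    hits = sum((gp_evaluate(tree, f) > threshold) == label
               for f, label in samples)
    return hits / len(samples)


# Usage: a hand-written tree encoding x0 * x1 - 0.5, a feature interaction
# that a fixed per-feature weighting cannot express.
tree = ("sub", ("mul", ("x", 0), ("x", 1)), ("const", 0.5))
samples = [([2.0, 1.0], True), ([0.5, 0.5], False),
           ([1.0, 1.0], True), ([0.1, 3.0], False)]
print(accuracy(tree, samples))
print(random_tree(2, 3, random.Random(0)))
```

In this sketch the tree itself is the evolved object, so inspecting a high-fitness tree can show which features interact, which is the kind of "new information on important relationships among features" the abstract mentions.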

Original language: American English
State: Published - Jan 1 1996
Event: Proceedings of the First Annual Conference on Genetic Programming
Duration: Jan 1 1996 → …

Conference

Conference: Proceedings of the First Annual Conference on Genetic Programming
Period: 1/1/96 → …

Disciplines

  • Bioinformatics
  • Communication
  • Communication Technology and New Media
  • Computer Sciences
  • Databases and Information Systems
  • Life Sciences
  • OS and Networks
  • Physical Sciences and Mathematics
  • Science and Technology Studies
  • Social and Behavioral Sciences
