Can computationally designed protein sequences improve secondary structure prediction?

Date
2011-04-18
Abstract
Computational sequence design methods are used to engineer proteins with desired properties, such as increased thermal stability and novel function. In addition, these algorithms can be used to identify an envelope of sequences that may be compatible with a particular protein fold topology. In this regard, we hypothesized that sequence property prediction, specifically of secondary structure, could be significantly enhanced by using a large database of computationally designed sequences. We performed a large-scale test of this hypothesis with 6511 diverse protein domains and 50 designed sequences per domain. After analysis of the inherent accuracy of the designed sequences database, we realized that it was necessary to put constraints on what fraction of the native sequence should be allowed to change. With mutational constraints, accuracy was improved vs. no constraints, but the diversity of designed sequences, and hence the effective size of the database, was moderately reduced. Overall, the best three-state prediction accuracy (Q3) that we achieved was nearly a percentage point improved over using a natural sequence database alone, well below the theoretical possibility for improvement of 8-10 percentage points. Furthermore, our nascent method was used to augment the state-of-the-art PSIPRED program by a percentage point.
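For readers unfamiliar with the metric, Q3 is conventionally the percentage of residues whose three-state label (helix, strand, or coil) is predicted correctly. The sketch below assumes that conventional definition and the common H/E/C letter encoding. The abstract does not spell out either, and the function name `q3_accuracy` is hypothetical, not taken from the paper.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Return the percentage of positions where the predicted
    three-state label matches the observed one.

    Assumption: both strings use one letter per residue
    (H = helix, E = strand, C = coil) and are aligned position by position.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    # Count positions where the two label strings agree.
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Illustrative labels only, not data from the study.
example_q3 = q3_accuracy("HHHEECCC", "HHHEEECC")
```

On this scale, the reported gain of about one percentage point corresponds to roughly one additional correctly labeled residue per hundred.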