De novo protein design. II. Plasticity in sequence space. (1999)

by P Koehl, M Levitt
Venue: J. Mol. Biol.

Results 1 - 9 of 9

Folding@Home and Genome@Home: Using distributed computing to tackle previously intractable problems in computational biology

by Stefan M. Larson, Christopher D. Snow, Michael Shirts, Vijay S. Pande
Cited by 103 (0 self)
For decades, researchers have been applying computer simulation to address problems in biology. However, many of these grand challenges in computational biology, such as simulating how proteins fold, remained unsolved due to their great complexity. Indeed, even to simulate the fastest folding protein would require decades on the fastest modern CPUs. Here, we review novel methods to fundamentally speed such previously intractable problems using a new computational paradigm: distributed computing. By efficiently harnessing tens of thousands of computers throughout the world, we have been able to break previous computational barriers. However, distributed computing brings new challenges, such as how to efficiently divide a complex calculation among many PCs that are connected by relatively slow networking. Moreover, even if the challenge of accurately reproducing reality can be conquered, a new challenge emerges: how can we take the results of these simulations (typically tens to hundreds of gigabytes of raw data) and gain some insight into the questions at hand? This challenge of analyzing the sea of data resulting from large-scale simulation will likely remain for decades to come.

A Search for Energy Minimized Sequences of Proteins

by Anupam Nath Jha, G. K. Ananthasuresh, Saraswathi Vishveshwara, 2009
Cited by 1 (0 self)
In this paper, we present numerical evidence that supports the notion of minimization in the sequence space of proteins for a target conformation. We use the conformations of real proteins in the Protein Data Bank (PDB) and present computationally efficient methods to identify the sequences with minimum energy. We use an edge-weighted connectivity graph for ranking the residue sites with a reduced amino acid alphabet and then use continuous optimization to obtain the energy-minimizing sequences. Our methods enable the computation of a lower bound as well as a tight upper bound for the energy of a given conformation. We validate our results by using three different inter-residue energy matrices for five proteins from the PDB, and by comparing our energy-minimizing sequences with 80 million diverse sequences that are generated based on different considerations in each case. When we submitted some of our chosen energy-minimizing sequences to the Basic Local Alignment Search Tool (BLAST), we obtained some sequences from the nonredundant protein sequence database that are similar to ours with an E-value of the order of 10^-7. In summary, we conclude that proteins show a trend towards minimizing energy in the sequence space but do not seem to adopt the global energy-minimizing sequence. The reason for this could be either that the existing energy matrices are not able to accurately represent the inter-residue interactions in the context of the protein environment or that Nature does not push the

Citation Context

...ating sequences with better capabilities to fight diseases as shown by the enhanced antimicrobial property of hbD2, a 41-residue peptide [13]. Improved specificity is also possible as demonstrated [14] in the case of the myoglobin family. De novo protein design necessarily requires a search in the sequence space. Although there is no guiding principle, many experimental and computational approaches hav...
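
The idea in this entry, scoring candidate sequences on a fixed conformation with an inter-residue energy matrix and looking for the minimum, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the contact list, the two-letter (H/P) alphabet, and the energy values are invented, and exhaustive enumeration stands in for the authors' graph-based ranking and continuous optimization, which are not reproduced.

```python
from itertools import product

# Hypothetical toy inputs (not from the paper): residue pairs in contact
# in a fixed conformation, and a made-up H/P inter-residue energy matrix.
CONTACTS = [(0, 3), (1, 4), (0, 5), (2, 5)]
ENERGY = {("H", "H"): -1.0, ("H", "P"): 0.0, ("P", "H"): 0.0, ("P", "P"): 0.5}


def sequence_energy(seq, contacts=CONTACTS, energy=ENERGY):
    """Sum the pairwise energies over all residue pairs in contact."""
    return sum(energy[(seq[i], seq[j])] for i, j in contacts)


def min_energy_sequence(length, alphabet="HP"):
    """Exhaustively search sequence space for the lowest-energy sequence."""
    candidates = ("".join(s) for s in product(alphabet, repeat=length))
    return min(candidates, key=sequence_energy)


best = min_energy_sequence(6)
print(best, sequence_energy(best))
```

Enumeration costs `len(alphabet) ** length` evaluations, so it is only feasible for toy cases; that cost is what motivates reduced alphabets and smarter optimization for real proteins.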

What’s in a likelihood? Simple models of protein evolution and the contribution of structurally viable reconstructions to the likelihood

by Clemens Lakner, Mark T Holder, Nick Goldman, and Gavin J P Naylor - Syst. Biol
Cited by 1 (0 self)
Abstract.-Most phylogenetic models of protein evolution assume that sites are independent and identically distributed. Interactions between sites are ignored, and the likelihood can be conveniently calculated as the product of the individual site likelihoods. The calculation considers all possible transition paths (also called substitution histories or mappings) that are consistent with the observed states at the terminals, and the probability density of any particular reconstruction depends on the substitution model. The likelihood is the integral of the probability density of each substitution history taken over all possible histories that are consistent with the observed data. We investigated the extent to which transition paths that are incompatible with a protein's three-dimensional structure contribute to the likelihood. Several empirical amino acid models were tested for sequence pairs of different degrees of divergence. When simulating substitutional histories starting from a real sequence, the structural integrity of the simulated sequences quickly disintegrated. This result indicates that simple models are clearly unable to capture the constraints on sequence evolution. However, when we sampled transition paths between real sequences from the posterior probability distribution according to these same models, we found that the sampled histories were largely consistent with the tertiary structure. This suggests that simple empirical substitution models may be adequate for interpolating changes between observed sequences during phylogenetic inference despite the fact that the models cannot predict the effects of structural constraints from first principles. This study is significant because it provides a quantitative assessment of the biological realism of substitution models from the perspective of protein structure, and it provides insight on the prospects for improving models of protein sequence evolution. 
[Ancestral state reconstruction; empirical amino acid models; maximum likelihood; phylogenetics; protein structure.]
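
The site-independence assumption described in the abstract means the likelihood of a sequence pair is a product of per-site terms. A minimal sketch, assuming the simplest of the models named in this entry (the Poisson model, with equal rates among the 20 amino acids and equal equilibrium frequencies); the function names are hypothetical and this is not the authors' code:

```python
import math

N_STATES = 20  # number of amino acids


def poisson_transition_prob(same, nu):
    """Probability of ending in a given state after nu expected
    substitutions per site, under equal rates between all states."""
    decay = math.exp(-N_STATES / (N_STATES - 1) * nu)
    if same:
        return 1 / N_STATES + (N_STATES - 1) / N_STATES * decay
    return 1 / N_STATES - decay / N_STATES


def pairwise_log_likelihood(seq_a, seq_b, nu):
    """Log-likelihood of two aligned sequences separated by branch length nu.

    Sites are treated as independent, so the per-site log terms simply add.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    log_l = 0.0
    for a, b in zip(seq_a, seq_b):
        # Equilibrium frequency of the first state times the transition probability.
        log_l += math.log(poisson_transition_prob(a == b, nu) / N_STATES)
    return log_l
```

Empirical models such as WAG and LG replace the single shared rate with an estimated exchangeability matrix and unequal frequencies, but the per-site product structure stays the same.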

Citation Context

...edure should generate sequences that are compatible with the native conformation (stability) and incompatible with other structures (specificity; for a detailed treatment of the subject, see Koehl and Levitt 1999a,b). If the sampled ancestral sequences are to be compatible with the structure of the wild types, they must fulfill these criteria as well. Three approaches were taken to evaluate the compatibility of the sampled sequences with the crystal structures. First, we used empirically derived contact potentials to assess sequence-structure compatibility. These so-called knowledge-based potentials were used to calculate the pseudo-energy (PE), which is tightly correlated with the free energy of the sequence in the final folded form. Second, in order to test for structural specificity, we estimated t...

TABLE 3. Branch lengths ν (expected number of substitutions per site) and likelihoods for the sequence pairs under different models

Protein         ν (Poisson)   ν (WAG)   ν (LG)    −lnL (Poisson)   −lnL (WAG)   −lnL (LG)
Parvalbumin A   0.54166       0.54033   0.58710   532.923          478.894      482.838
Myoglobin       0.46225       0.46228   0.49951   723.726          668.858      665.950
Lysozyme c      0.48017       0.51281   0.54236   619.853          594.483      601.541
HPPK            0.60856       0.66055   0.72945   791.087          741.261      735.662

Program of Study Committee:

by Yaping Feng, Amy Andreotti, Richard Honzatko, Xueyu Song, Zhijun Wu, 2008
New statistical potentials for improved protein structure prediction

Increased Detection of Structural Templates Using Alignments of Designed Sequences (PROTEINS: Structure, Function, and Genetics 51:390–396, 2003)

by Stefan M. Larson, Amit Garg, John R. Desjarlais, Vijay S. P
Protein structure prediction by comparative modeling benefits greatly from the use of multiple sequence alignment information to improve the accuracy of structural template identification and the alignment of target sequences to structural templates. Unfortunately, this benefit is limited to those protein sequences for which at least several natural sequence homologues exist. We show here that the use of large diverse alignments of computationally designed protein sequences confers many of the same benefits as natural sequences in identifying structural templates for comparative modeling targets. A large-scale massively parallelized application of an all-atom protein design algorithm, including a simple model of peptide backbone flexibility, has allowed us to generate 500 diverse, non-native, high-quality sequences for each of 264 protein structures in our test set. PSI-BLAST searches using the sequence profiles generated from the designed sequences (“reverse” BLAST searches) give near-perfect accuracy in identifying true structural homologues of the parent structure, with 54% coverage. In 41 of 49 genomes scanned using reverse BLAST searches, at least one novel structural template (not found by the standard method of PSI-BLAST against PDB) is identified. Further improvements in coverage, through optimizing the scoring function used to design sequences and continued application to new protein structures beyond the test set, will allow this method to mature into a useful strategy for identifying distantly related structural templates.

presented and publicly defended by

by Anthony Benoist, 2010
Elements for adapting life cycle assessment methodology to plant-based fuels: the case of the first generation

Citation Context

...search for the optimal structure for a given sequence. Thus, rather than exploring the immense space of conformations, this approach reduces the search domain to sequence space [110], [118], [113], [170]. [pastel-00003713, version 1 - 23 Jul 2010] Homology modeling rests on the following hypothesis: if a sequence of unknown structure is similar to a sequence of st...

conformational entropy

by Daniele Sciretti, Pierpaolo Bruscolini, Ro Pelizzola, Marco Pretti, Alfonso Jaramillo
Computational protein design with side-chain

By

by Sourav Rakshit, 2011
Contents ii

design

by C. A. Floudas, H. K. Fung, S. R. Mcallister, M. Mönnigmann, R. Rajgaria, 2005
in protein structure prediction and de novo protein

Citation Context

... approach for protein design guaranteed the specificity of the designed sequence for the template by fixing the amino acid composition, and they proved this new procedure converged in sequence space (Koehl and Levitt, 1999b). The ultimate goal of computational protein design is of course not just to achieve the desired structure but also to render specific functions or properties to the novel protein. In the latter res...
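
One way to picture a search that keeps the amino acid composition fixed is to allow only moves that swap the residues at two positions, since a swap cannot change how many of each residue type the sequence contains. The sketch below is an illustration under stated assumptions, not Koehl and Levitt's procedure: the contacts, the H/P alphabet, the energy values, and the greedy acceptance rule are all invented for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy inputs (not from the cited work).
CONTACTS = [(0, 3), (1, 4), (2, 5)]
ENERGY = {("H", "H"): -1.0, ("H", "P"): 0.0, ("P", "H"): 0.0, ("P", "P"): 0.5}


def sequence_energy(seq):
    """Sum the pairwise energies over all residue pairs in contact."""
    return sum(ENERGY[(seq[i], seq[j])] for i, j in CONTACTS)


def swap_descent(start):
    """Greedy descent that only swaps two positions, so composition is fixed."""
    seq = list(start)
    improved = True
    while improved:
        improved = False
        current = sequence_energy(seq)
        for i, j in combinations(range(len(seq)), 2):
            if seq[i] == seq[j]:
                continue  # swapping identical residues changes nothing
            seq[i], seq[j] = seq[j], seq[i]
            if sequence_energy(seq) < current:
                improved = True  # keep the swap and rescan from the new state
                break
            seq[i], seq[j] = seq[j], seq[i]  # undo a non-improving swap
    return "".join(seq)


result = swap_descent("HPHPHP")
# The residue counts are unchanged by construction.
assert Counter(result) == Counter("HPHPHP")
```

Because every accepted swap strictly lowers the energy and the number of sequences is finite, the loop always terminates, although in general only at a local minimum.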


Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University