Results 1-10 of 22
Sali A: Statistical potentials for fold assessment
 Protein Sci 2002
Cited by 56 (16 self)
A protein structure model generally needs to be evaluated to assess whether or not it has the correct fold. To improve fold assessment, four types of a residue-level statistical potential were optimized, including distance-dependent, contact, φ/ψ dihedral angle, and accessible surface statistical potentials. Approximately 10,000 test models with the correct and incorrect folds were built by automated comparative modeling of protein sequences of known structure. The criterion used to discriminate between the correct and incorrect models was the Z-score of the model energy. The performance of a Z-score was determined as a function of many variables in the derivation and use of the corresponding statistical potential. The performance was measured by the fractions of the correctly and incorrectly assessed test models. The most discriminating combination of any one of the four tested potentials is the sum of the normalized distance-dependent and accessible surface potentials. The distance-dependent potential that is optimal for assessing models of all sizes uses both Cα and Cβ atoms as interaction centers, distinguishes between all 20 standard residue types, has the distance range of 30 Å, and is derived and used by taking into account the sequence separation of the interacting atom pairs. The terms for the sequentially local interactions are significantly less informative than those for the sequentially nonlocal interactions. The accessible surface potential that
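The Z-score criterion mentioned in this abstract can be illustrated with a minimal sketch. It assumes the model energy is compared against a reference set of energies (for example, energies of alternative or decoy models). The function names, the use of the population standard deviation, and the threshold of -2.0 are hypothetical choices for illustration, not values taken from the paper.

```python
from statistics import mean, pstdev


def z_score(model_energy, reference_energies):
    """Z-score of one model energy against a reference set of energies.

    A strongly negative value means the model's energy is much lower
    than is typical for the reference set.
    """
    mu = mean(reference_energies)
    sigma = pstdev(reference_energies)  # population standard deviation (assumption)
    if sigma == 0:
        raise ValueError("reference energies have zero spread")
    return (model_energy - mu) / sigma


def looks_correct(model_energy, reference_energies, threshold=-2.0):
    """Hypothetical decision rule: accept the fold if the Z-score is low enough."""
    return z_score(model_energy, reference_energies) <= threshold
```

In practice the reference distribution and the cutoff would be tuned on test models with known correct and incorrect folds, which is the kind of optimization the abstract describes.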
Protein design: a perspective from simple tractable models
 Design
, 1998
Cited by 25 (0 self)
Recent progress in computational approaches to protein design builds on advances in statistical mechanical protein folding theory. Here, the number of sequences folding into a given conformation is evaluated and a simple condition for a protein model’s designability is outlined.
The diverse functions of Dot1 and H3K79 methylation. Genes Dev;25:1345-1358
Cited by 16 (1 self)
design of mechanical products based on behavior-driven function-environment-structure modeling framework
Influence of protein structure databases on the predictive power of statistical pair potentials. Proteins: Struct Func Genet 1998;31:139-149
 Proteins: Structure, Function, and Genetics
, 1998
Cited by 15 (1 self)
ABSTRACT A long-standing goal in protein structure studies is the development of reliable energy functions that can be used both to verify protein models derived from experimental constraints and for theoretical protein folding and inverse folding computer experiments. In that respect, knowledge-based statistical pair potentials have attracted considerable interest recently, mainly because they include the essential features of protein structures as well as solvent effects at a low computing cost. However, the basis on which statistical potentials are derived has been questioned. In this paper, we investigate statistical pair potentials derived from protein three-dimensional structures, addressing in particular questions related to the form of these potentials, as well as to the content of the database from which they are derived. We have shown that statistical pair potentials depend on the size of the proteins included in the database, and that this dependence can be reduced by considering only pairs of residues close in space (i.e., with a cutoff of 8 Å). We have also shown that statistical potentials carry a memory of the quality of the database in terms of the amount and diversity of secondary structure it contains. We find, for example, that potentials derived from a database containing α-proteins will only perform best on α-proteins in fold recognition computer experiments. We believe that this is an overall weakness of these potentials, which must be kept in mind when constructing a database. Proteins 31:139–149, 1998. © 1998 Wiley-Liss, Inc. Key words: protein structure; statistical potentials; protein structure database; assessing protein models
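To make the idea of a database-derived pair potential with a distance cutoff concrete, here is a minimal sketch of a generic inverse-Boltzmann contact potential. It is an assumption-laden illustration, not the authors' derivation: the input layout (one representative point per residue), the random-mixing reference state, and the function name `contact_potential` are all hypothetical. Only the 8 Å cutoff comes from the abstract.

```python
import math
from collections import Counter
from itertools import combinations


def contact_potential(structures, cutoff=8.0):
    """Derive a contact potential e(a, b) = -ln(N_obs / N_exp) from a database.

    structures: list of proteins, each a list of (residue_type, (x, y, z)).
    Two residues are in contact when their points lie within `cutoff` angstroms.
    The expected count assumes residue types pair at random (an assumption).
    """
    pair_counts = Counter()   # observed contacts per unordered residue-type pair
    type_counts = Counter()   # how often each type takes part in a contact
    for residues in structures:
        for (ta, ca), (tb, cb) in combinations(residues, 2):
            if math.dist(ca, cb) <= cutoff:
                pair_counts[tuple(sorted((ta, tb)))] += 1
                type_counts[ta] += 1
                type_counts[tb] += 1

    total_pairs = sum(pair_counts.values())
    total_ends = sum(type_counts.values())  # equals 2 * total_pairs
    potential = {}
    for (a, b), n_obs in pair_counts.items():
        fa = type_counts[a] / total_ends
        fb = type_counts[b] / total_ends
        # Unordered pairs of distinct types can form in two orders.
        n_exp = total_pairs * fa * fb * (1 if a == b else 2)
        potential[(a, b)] = -math.log(n_obs / n_exp)
    return potential
```

Because every count comes from the database, the composition of that database (protein sizes, secondary structure content) feeds directly into the resulting energies, which is the dependence the abstract investigates.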
Improving Protein Structure Prediction with Model-Based Search
 BIOINFORMATICS
, 2005
Cited by 15 (3 self)
Motivation: De novo protein structure prediction can be formulated as search in a high-dimensional space. One of the most frequently used computational tools to solve such search problems is the Monte Carlo method. We present a novel search technique, called model-based search. This method samples the high-dimensional search space to build an approximate model of the underlying function. This model is incrementally refined in areas of interest, while areas that are not of interest are excluded from further exploration. Model-based search derives its efficiency from the fact that the information obtained during the exploration of the search space is used to guide further exploration. In contrast, Monte Carlo-based techniques are memoryless and exploration is performed based on random walks, ignoring the information obtained in previous steps. Results: Model-based search is applied to protein structure prediction, where search is employed to find the global minimum of the protein’s energy landscape. We show that model-based search uses computational resources more efficiently to find lower-energy conformations of proteins when compared to one of the leading protein structure prediction methods, which relies on a tailored Monte Carlo method to perform search. The performance improvements become more pronounced as the dimensionality of the search space increases. We show that model-based search enables more accurate protein structure prediction than previously possible. Furthermore, we believe that similar performance improvements can be expected in other problems that are currently solved using Monte Carlo-based search methods. Availability: An implementation of model-based search can be obtained by contacting the authors.
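The "memoryless" property attributed to Monte Carlo search can be seen in a minimal Metropolis sketch on a toy one-dimensional energy function. This is a generic textbook-style illustration, not the tailored Monte Carlo method or the model-based search from the paper; the function name, step size, temperature, and seed are hypothetical parameters.

```python
import math
import random


def metropolis_minimize(energy, x0, steps=1000, step_size=0.5,
                        temperature=1.0, seed=0):
    """Random-walk Metropolis search for a low-energy point.

    Each proposal depends only on the current state; nothing learned from
    earlier samples shapes where the walk goes next.
    """
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e  # bookkeeping only; it does not guide the walk
    for _ in range(steps):
        candidate = x + rng.uniform(-step_size, step_size)
        e_candidate = energy(candidate)
        # Always accept downhill moves; accept uphill moves with Boltzmann probability.
        if e_candidate <= e or rng.random() < math.exp(-(e_candidate - e) / temperature):
            x, e = candidate, e_candidate
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e
```

A model-based approach, as described in the abstract, would instead keep the sampled points and use them to decide which regions deserve further sampling.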
Energy Strain in Three-Dimensional Protein Structures
, 1998
Cited by 7 (0 self)
Introduction Identification of strain in protein three-dimensional structures has many important implications for both experimental structure determination and theoretical modeling and design. The term `steric strain' usually describes unfavorable or disallowed conformations or structural abnormalities of the amino acid residues that are detected by analysis of φ/ψ maps, van der Waals clashes, etc. High-resolution X-ray crystal structures from the PDB [1] were analyzed [2,3] and it was shown that some parts of the polypeptide chain manifest higher strain, as a result of packing or functional requirements. An alternative and the most likely source of strain is the error in the coordinates of a structure, ranging from misplaced side chains and flipped peptide groups to wrong chain tracing [4-6]. Similarly, the dynamic nature of a biopolymer in solution and the ambiguities in peak assignments may result in errors in the structures solved by NMR. Exam
An anytime local-to-global optimization algorithm for protein threading in Θ(m^2 n^2) space
 J Comput Biol
, 1999
Cited by 4 (1 self)
This paper describes a novel anytime branch-and-bound or best-first threading search algorithm for gapped block protein sequence–structure alignment with general sequence residue pair interactions. The new algorithm (1) returns a good approximate answer quickly, (2) iteratively improves that answer to the global optimum if allowed more time, (3) eventually produces a proof that the final answer found is indeed the global optimum, and (4) always terminates correctly within a bounded number of steps if allowed sufficient space and time. It runs in polynomial space, which is asymptotically dominated by the O(m^2 ñ^2) space required by the lower bound computation. Using previously published data sets and the Bryant–Lawrence (1993) objective function, the algorithm found the true (proven) global optimum in less than 5 min in all search spaces of size 10^25 or smaller (sequences to 478 residues), and a putative (not guaranteed) optimum in less than 5 hr in all search spaces of size 10^60 or smaller (sequences to 793 residues, cores to 42 secondary structure segments). The threading in the largest case studied was eventually proven to be globally optimal; the corresponding search speed in that case was the equivalent of 1.5 × 10^56 threadings/sec, a speedup exceeding 10^25 over previously published batch branch-and-bound speeds, and exceeding 10^50 over previously published exhaustive search speeds, using the same objective function and threading paradigm. Implementation-independent measures of search efficiency are defined for equivalent branching factor, depth, and probability of success per draw; empirical data on these measures are given. The general approach should apply to other alignment methodologies and search methods that use a divide-and-conquer strategy.
Model-based search to determine minima in molecular energy landscapes
, 2005
Cited by 1 (0 self)
Search for the global minimum in a molecular energy landscape populated with numerous local minima is a difficult task. Search techniques relevant to such complex spaces can be classified as either global or local. Global search explores the entire space, guaranteeing the global extremum will be found. To accomplish this, the number of samples required grows exponentially with the number of dimensions. Since this is clearly not computationally tractable, global search is impractical in high-dimensional spaces. Local search, on the other hand, employs gradient descent to avoid searching the entire exponential space. Gradient descent methods are susceptible to getting stalled in local minima and consequently, no guarantees can be made about finding the global minimum. We propose a middle ground that minimizes the effects of exponential space and local minima by integrating domain knowledge and information generated during search into a model, and then using this model to focus computation on regions of increasing relevance. Directing resources to multiple relevant regions prevents oversampling local minima. At the same time, the exploration of only significant regions avoids the intractable computational requirements of high-dimensional spaces. The proposed method, called Model-Based Search (MBS), is compared to the local search method Monte Carlo as implemented in Rosetta, currently considered the best computational protein structure prediction method. The results indicate that MBS is significantly better at finding lower-energy minima than the Monte Carlo technique implemented as part of Rosetta. This effect is amplified as the dimensionality of the search space increases.
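The exponential growth in samples that makes exhaustive global search impractical can be shown with a short calculation. The sketch assumes a regular grid with the same number of points along every dimension; that sampling scheme and the function name are hypothetical and are not described in the abstract.

```python
def grid_samples(points_per_dim, dims):
    """Number of samples needed to cover a regular grid in `dims` dimensions."""
    return points_per_dim ** dims


# With 10 points per dimension, each added dimension multiplies the cost by 10.
for dims in (1, 2, 3, 10, 100):
    print(dims, grid_samples(10, dims))
```

Even a coarse grid becomes unreachable at the dimensionality of a protein conformation space, which is why the abstract argues for focusing samples on relevant regions.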
Predictive methods using protein sequences
 Bioinformatics, A Practical Guide to the Analysis of Genes and Proteins, chapter 11
, 1998