Pcons: A neural-network-based consensus predictor that improves fold recognition
- Protein Sci
, 2001
"... improves fold recognition ..."
Valencia A: Effective use of sequence correlation and conservation in fold recognition
- J Mol Biol
, 1999
Abstract - Cited by 37 (3 self)
Protein families are a rich source of information; sequence conservation and sequence correlation are two of the main properties that can be derived from the analysis of multiple sequence alignments. Sequence conservation is related to the direct evolutionary pressure to retain the chemical characteristics of some positions in order to maintain a given function. Sequence correlation is attributed to the small sequence adjustments needed to maintain protein stability against constant mutational drift. Here, we showed that sequence conservation and correlation were each frequently informative enough to detect incorrectly folded proteins. Furthermore, combining conservation, correlation, and polarity, we achieved an almost perfect discrimination between native and incorrectly folded proteins. Thus, we made use of this information for threading by evaluating the models suggested by a threading method according to the degree of proximity of the corresponding correlated, conserved, and apolar residues. The results showed that the fold recognition capacity of a given threading approach could be improved almost fourfold by selecting the alignments that score best under the three different sequence-based approaches.
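The selection step described in this abstract (choosing the alignment that scores best under three sequence-based criteria) can be illustrated with a small sketch. The abstract does not say how the three scores are combined, so the rank-sum rule used here is an assumption, and the names `rank_sum_select`, `aln_a`, `aln_b`, `aln_c` and all score values are hypothetical.

```python
from typing import Dict, List


def rank_sum_select(models: Dict[str, Dict[str, float]], criteria: List[str]) -> str:
    """Return the model with the smallest sum of ranks over all criteria.

    Assumption: higher scores are better for every criterion, and the
    criteria are combined by summing ranks (not the paper's stated rule).
    """
    totals = {name: 0 for name in models}
    for criterion in criteria:
        # Best-scoring model gets rank 0 for this criterion.
        ordered = sorted(models, key=lambda m: models[m][criterion], reverse=True)
        for rank, name in enumerate(ordered):
            totals[name] += rank
    return min(totals, key=totals.get)


# Hypothetical proximity scores for three candidate alignments.
candidates = {
    "aln_a": {"correlation": 0.2, "conservation": 0.9, "apolar": 0.4},
    "aln_b": {"correlation": 0.7, "conservation": 0.6, "apolar": 0.8},
    "aln_c": {"correlation": 0.5, "conservation": 0.3, "apolar": 0.1},
}
best = rank_sum_select(candidates, ["correlation", "conservation", "apolar"])
print(best)
```

Summing ranks rather than raw scores avoids having to put the three criteria on a common scale, which is why it is used for this illustration.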
A Machine Learning Information Retrieval Approach to Protein Fold Recognition
Abstract - Cited by 27 (5 self)
Motivation: Recognizing proteins that have similar tertiary structure is the key step of template-based protein structure prediction methods. Traditionally, a variety of alignment methods are used to identify similar folds, based on sequence similarity and sequence-structure compatibility. Although these methods are complementary, their integration has not been thoroughly exploited. Statistical machine learning methods provide tools for integrating multiple features, but so far these methods have been used primarily for protein and fold classification, rather than addressing the retrieval problem of fold recognition, that is, finding a proper template for a given query protein. Results: Here we present a two-stage machine learning, information retrieval approach to fold recognition. First, we use alignment methods to derive pairwise similarity features for query-template protein pairs. We also use global profile-profile alignments in combination with predicted secondary structure, relative solvent accessibility, contact map, and beta-strand pairing to extract pairwise structural compatibility features. Second, we apply support vector machines to these features to predict the structural relevance (i.e. in the same fold or not) of the query-template pairs. For each query, the continuous relevance scores are used to rank the templates. The FOLDpro approach is modular, scalable, and effective. Compared to 11 other fold recognition methods, FOLDpro yields the best results in almost all standard categories on a comprehensive benchmark dataset. Using predictions of the top-ranked template, the sensitivity is about 85%, 56%, and 27% at the family, superfamily, and fold levels respectively. Using the 5 top-ranked templates, the sensitivity increases to 90%, 70%, and 48%. Availability: The FOLDpro server is available with the SCRATCH
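The top-1 and top-5 sensitivities quoted in this abstract depend on ranking templates by a continuous relevance score for each query. The following is a minimal sketch of that evaluation only, assuming the scores have already been produced by some classifier; the function name `top_k_sensitivity`, the query and template identifiers, and the score values are hypothetical and not taken from FOLDpro.

```python
from typing import Dict, Set


def top_k_sensitivity(scores: Dict[str, Dict[str, float]],
                      relevant: Dict[str, Set[str]],
                      k: int) -> float:
    """Fraction of queries with at least one relevant template in the top k.

    scores[query][template] is a relevance score (higher = more relevant);
    relevant[query] is the set of templates sharing the query's structure
    class at the level being evaluated (family, superfamily, or fold).
    """
    if not scores:
        return 0.0
    hits = 0
    for query, template_scores in scores.items():
        # Rank templates for this query by descending relevance score.
        ranked = sorted(template_scores, key=template_scores.get, reverse=True)
        if any(t in relevant.get(query, set()) for t in ranked[:k]):
            hits += 1
    return hits / len(scores)


# Hypothetical scores for two queries against three templates.
example_scores = {
    "q1": {"t1": 0.9, "t2": 0.2, "t3": 0.1},
    "q2": {"t1": 0.3, "t2": 0.8, "t3": 0.5},
}
example_relevant = {"q1": {"t1"}, "q2": {"t3"}}

print(top_k_sensitivity(example_scores, example_relevant, 1))
print(top_k_sensitivity(example_scores, example_relevant, 2))
```

In this toy data the second query's relevant template is ranked second, so sensitivity rises when more top-ranked templates are considered, mirroring the top-1 versus top-5 comparison in the abstract.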
Hidden Markov models that use predicted local structure for fold recognition: alphabets of backbone geometry
- Proteins
, 2003
Abstract - Cited by 24 (10 self)
An important problem in computational biology is predicting the structure of the large number of putative proteins discovered by genome sequencing projects. Fold-recognition methods attempt to solve the problem by relating the target proteins to known structures, searching for template proteins homologous to the target. Remote homologs which may have significant structural similarity are often not detectable by sequence similarities alone. To address this, we incorporated predicted local structure, a generalization of secondary structure, into two-track profile HMMs. We did not rely on a simple helix-strand-coil definition of secondary structure,
Structural genomics: computational methods for structure analysis
- Protein Sci
, 2003
"... service ..."
Protein Structure Prediction by Threading: Why it Works and Why it Does Not
- J. Mol. Biol
, 1998
Abstract - Cited by 9 (1 self)
the quality of potentials used. These results are rationalized in terms of a threading free energy landscape. Possible ways to overcome the fundamental limitations of threading are discussed briefly. © 1998 Academic Press. Keywords: threading; Monte Carlo procedure; protein structure prediction. Introduction: The problem of predicting protein conformation from sequences is of great importance and has drawn a lot of attention recently (see e.g. Moult et al., 1997; Shakhnovich, 1997a; Finkelstein, 1997; Jones, 1997; Levitt, 1997) with hundreds of papers from dozens of groups. A most desirable solution to the problem is to find a model and an algorithm that simulate folding of a protein pretty much in a way that mimics natural protein folding and converges to the native conformation. While some success along these lines has been documented (Kolinski & Skolnick, 1994), this approach encounters a number of serious technical difficulties, making ab initio structure predict
DSSPcont: continuous secondary structure assignments for proteins
- Nucleic Acids Research
, 2003
Abstract - Cited by 9 (0 self)
The DSSP program automatically assigns the secondary structure for each residue from the three-dimensional co-ordinates of a protein structure to one of eight states. However, discrete assignments are incomplete in that they cannot capture the continuum of thermal fluctuations. Therefore, DSSPcont
Protein Sequence Threading: Averaging over Structures
, 2002
Abstract - Cited by 8 (5 self)
Multiple sequence alignments are a routine tool in protein fold recognition, but multiple structure alignments are computationally less cooperative. This work describes a method for protein sequence threading and sequence-to-structure alignments that uses multiple aligned structures, the aim being to improve models from protein threading calculations. Sequences are aligned into a field due to corresponding sites in homologous proteins. On the basis of a test set of more than 570 protein pairs, the procedure does improve alignment quality, although no more than averaging over sequences. For the force field tested, the benefit of structure averaging is smaller than that of adding sequence similarity terms or a contribution from secondary structure predictions. Although there is a significant improvement in the quality of sequence-to-structure alignments, this does not directly translate to an immediate improvement in fold recognition capability. Proteins 2002;47:496-505.
Protein flexibility and intrinsic disorder
- Protein Sci
, 2004
Abstract - Cited by 8 (2 self)
Comparisons were made among four categories of protein flexibility: (1) low-B-factor ordered regions, (2) high-B-factor ordered regions, (3) short disordered regions, and (4) long disordered regions. Amino acid compositions of the four categories were found to be significantly different from each other, with high-B-factor ordered and short disordered regions being the most similar pair. The high-B-factor (flexible) ordered regions are characterized by a higher average flexibility index, higher average hydrophilicity, higher average absolute net charge, and higher total charge than disordered regions. The low-B-factor regions are significantly enriched in hydrophobic residues and depleted in the total number of charged residues compared to the other three categories. We examined the predictability of the high-B-factor regions and developed a predictor that discriminates between regions of low and high B-factors. This predictor achieved an accuracy of 70% and a correlation of 0.43 with experimental data, outperforming the 64% accuracy and 0.32 correlation of predictors based solely on flexibility indices. To further clarify the differences between short disordered regions and ordered regions, a predictor of short disordered regions was developed. Its relatively high accuracy of 81% indicates considerable differences between ordered and disordered regions. The distinctive amino acid biases of high-B-factor ordered regions, short disordered regions, and long disordered

