Single-body residue-level knowledge-based energy score combined with sequence-profile and secondary structure information for fold recognition (2004)

by H Zhou, Y Zhou
Venue: Proteins

Results 1 - 10 of 76

A Machine Learning Information Retrieval Approach to Protein Fold Recognition

by Jianlin Cheng, Pierre Baldi
Cited by 78 (12 self)
Motivation: Recognizing proteins that have similar tertiary structure is the key step of template-based protein structure prediction methods. Traditionally, a variety of alignment methods are used to identify similar folds, based on sequence similarity and sequence-structure compatibility. Although these methods are complementary, their integration has not been thoroughly exploited. Statistical machine learning methods provide tools for integrating multiple features, but so far these methods have been used primarily for protein and fold classification, rather than addressing the retrieval problem of fold recognition: finding a proper template for a given query protein. Results: Here we present a two-stage machine learning, information retrieval approach to fold recognition. First, we use alignment methods to derive pairwise similarity features for query-template protein pairs. We also use global profile-profile alignments in combination with predicted secondary structure, relative solvent accessibility, contact map, and beta-strand pairing to extract pairwise structural compatibility features. Second, we apply support vector machines to these features to predict the structural relevance (i.e. in the same fold or not) of the query-template pairs. For each query, the continuous relevance scores are used to rank the templates. The FOLDpro approach is modular, scalable, and effective. Compared to 11 other fold recognition methods, FOLDpro yields the best results in almost all standard categories on a comprehensive benchmark dataset. Using predictions of the top-ranked template, the sensitivity is about 85%, 56%, and 27% at the family, superfamily, and fold levels respectively. Using the 5 top-ranked templates, the sensitivity increases to 90%, 70%, and 48%. Availability: The FOLDpro server is available with the SCRATCH
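
As a rough illustration of the ranking step described in the abstract above, the sketch below ranks the templates of each query by a continuous relevance score and computes top-k sensitivity, taken here as the fraction of queries with at least one correct template among the k best-scored ones. It is a minimal sketch and not the FOLDpro implementation: the function name `top_k_sensitivity`, the tuple layout, and the toy scores are hypothetical, and it assumes every query has at least one relevant template in the library.

```python
from collections import defaultdict


def top_k_sensitivity(scored_pairs, k=1):
    """Fraction of queries with a relevant template among their k best-scored templates.

    scored_pairs: iterable of (query_id, template_id, score, is_relevant) tuples.
    The scores stand in for classifier outputs (hypothetical values here).
    """
    by_query = defaultdict(list)
    for query_id, _template_id, score, is_relevant in scored_pairs:
        by_query[query_id].append((score, is_relevant))

    if not by_query:
        return 0.0

    hits = 0
    for candidates in by_query.values():
        # Rank templates for this query from highest to lowest score.
        ranked = sorted(candidates, key=lambda c: c[0], reverse=True)
        if any(is_relevant for _score, is_relevant in ranked[:k]):
            hits += 1
    return hits / len(by_query)


# Toy data (made up for illustration): two queries, two templates each.
pairs = [
    ("q1", "t1", 0.9, False),
    ("q1", "t2", 0.8, True),
    ("q2", "t1", 0.7, True),
    ("q2", "t3", 0.2, False),
]
print(top_k_sensitivity(pairs, k=1))  # 0.5
print(top_k_sensitivity(pairs, k=2))  # 1.0
```

In the toy data, q1's best-scored template is not relevant, so only one of two queries is a hit at k=1; widening to k=2 recovers both.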

Generalized protein structure prediction based on combination of fold-recognition with de novo folding and evaluation of models

by Janusz M. Bujnicki - Proteins, 2005
Cited by 34 (17 self)
ABSTRACT To predict the tertiary structure of full-length sequences of all targets in CASP6, regardless of their potential category (from easy comparative modeling to fold recognition to apparent new folds) we used a novel combination of two very different approaches developed independently in our laboratories, which ranked quite well in different categories in CASP5. First, the GeneSilico meta-server was used to identify domains, predict secondary structure, and generate fold recognition (FR) alignments, which were converted to full-atom models using the “FRankenstein’s Monster” approach for comparative modeling (CM) by recombination of protein fragments. Additional models generated “de novo” by fully automated servers were obtained

SP5: improving protein fold recognition by using torsion angle profiles and profile-based gap penalty model. PLoS One 2008

by Wei Zhang, Song Liu, Yaoqi Zhou
Cited by 18 (1 self)
How to recognize the structural fold of a protein is one of the challenges in protein structure prediction. We have developed a series of single (non-consensus) methods (SPARKS, SP2, SP3, SP4) that are based on weighted matching of two to four sequence and structure-based profiles. There is a robust improvement in the accuracy and sensitivity of fold recognition as the number of matching profiles increases. Here, we introduce a new profile-profile comparison term based on real-value dihedral torsion angles. Together with an updated real-value solvent accessibility profile and a new variable gap-penalty model based on fractional power of insertion/deletion profiles, the new method (SP5) leads to a robust improvement over the previous SP methods. There is a 2% absolute increase (5% relative improvement) in alignment accuracy over SP4 based on two independent benchmarks. Moreover, SP5 achieves a 7% absolute increase (22% relative improvement) in the success rate of recognizing correct structural folds, and a 32% relative improvement in model accuracy for models within the same fold in the Lindahl benchmark. In addition, modeling accuracy of top-1 ranked models is improved by 12% over SP4 for the difficult targets in the CASP7 test set. These results highlight the importance of harnessing predicted structural properties in

Citation Context

...SP2, SP3, SP4) that are based on weighted matching of multiple profiles that include sequence profiles generated from multiple sequence alignment [9], predicted versus actual secondary structures [10,11], knowledge-based profile (single-body) score function [10], depth-dependent sequence profiles derived from template structures [11], and predicted versus actual solvent accessible surface area [12]. ...
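
The SP5 abstract above quotes each gain both as an absolute increase and as a relative improvement. The two are linked by the baseline value, so the baseline can be estimated from the pair. The sketch below does that arithmetic. The helper name `implied_baseline` is hypothetical, and the resulting baselines are back-of-envelope estimates derived from the rounded percentages, not figures stated in the abstract.

```python
def implied_baseline(absolute_gain_pct, relative_gain_pct):
    """Estimate the baseline (in percent) from an absolute and a relative gain.

    Uses the identity: absolute_gain = baseline * relative_gain / 100.
    Inputs are the rounded percentages quoted in the abstract, so the
    result is only an approximation.
    """
    if relative_gain_pct <= 0:
        raise ValueError("relative gain must be positive")
    return absolute_gain_pct * 100.0 / relative_gain_pct


# Alignment accuracy: 2% absolute, 5% relative.
alignment_baseline = implied_baseline(2, 5)   # 40.0
# Fold recognition success rate: 7% absolute, 22% relative.
fold_baseline = implied_baseline(7, 22)       # about 31.8

print(alignment_baseline, round(fold_baseline, 1))
```

Under these assumptions the quoted gains would correspond to alignment accuracy moving from roughly 40% to 42%, and fold recognition success rate from roughly 32% to 39%.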

Analysis of TASSER-based CASP7 protein structure prediction results

by Hongyi Zhou, Shashi B. Pandit, Seung Yup Lee, Jose Borreguero, Huiling Chen, Liliana Wroblewska, Jeffrey Skolnick - Proteins 69 (Suppl 8), 2007
Cited by 13 (4 self)
Abstract not found

Low-homology protein threading

by Jian Peng, Jinbo Xu - Bioinformatics 2010
Cited by 9 (1 self)
Motivation: The challenge of template-based modeling lies in the recognition of correct templates and generation of accurate sequence-template alignments. Homologous information has proved to be very powerful in detecting remote homologs, as demonstrated by the state-of-the-art profile-based method HHpred. However, HHpred does not fare well when the proteins under consideration are low-homology. A protein is low-homology if we cannot obtain a sufficient amount of homologous information for it from existing protein sequence databases. Results: We present a profile-entropy dependent scoring function for low-homology protein threading. This method models correlation among various protein features and determines their relative importance according to the amount of homologous information available. When the proteins under consideration are low-homology, our method relies more on structure information; otherwise, it relies more on homologous information. Experimental results indicate that our threading method greatly outperforms the best profile-based method HHpred and all the top CASP8 servers on low-homology proteins. Tested on the CASP8 hard targets, our threading method is also better than all the top CASP8 servers but slightly worse than Zhang-Server. This is significant considering that Zhang-Server and other top CASP8 servers use a combination of multiple structure-prediction techniques including consensus methods, multiple-template modeling, template-free modeling and model refinement, while our method is a classical single-template-based threading method without any post-threading refinement.

Citation Context

...e databases (see Section 2 for quantitative definition). Many threading methods, such as MUSTER (Wu and Zhang, 2008), Phyre2 (Kelley and Sternberg, 2009) and SPARKS/SP3/SP5 (Zhang et al., 2004, 2008; Zhou and Zhou 2004, 2005), aim at going beyond profile-based methods by combining homologous information with a variety of structure information. However, recent CASP evaluations (Moult et al., 2005, 2007) demonstrate ...

QBES: predicting real values of solvent accessibility from sequences by efficient, constrained energy optimization. Proteins 2006;63:961–966

by Zhigang Xu, Chi Zhang, Song Liu, Yaoqi Zhou
Cited by 5 (2 self)
ABSTRACT Solvent accessibility, one of the key properties of amino acid residues in proteins, can be used to assist protein structure prediction. Various approaches such as neural network, support vector machines, probability profiles, information theory, Bayesian theory, logistic function, and multiple linear regression have been developed for solvent accessibility prediction. In this article, a much simpler quadratic programming method based on the buriability parameter set of amino acid residues is developed. The new method, called QBES (Quadratic programming and Buriability Energy function for Solvent accessibility prediction), is reasonably accurate for predicting the real value of solvent accessibility. By using a dataset of 30 proteins to optimize three parameters, the average correlation coefficients between the predicted and actual solvent accessibility are about 0.5 for all four independent test sets ranging from 126 to 513 proteins. The method is efficient. It takes only 20 min for a regular PC to obtain results of 30 proteins with an average length of 263 amino acids. Although the proposed method is less accurate than a few more sophisticated methods based on neural network or support vector machines, this is the first attempt to predict solvent accessibility by energy optimization with constraints. Possible improvements and other applications of the method are discussed. Proteins

STRUCTFAST: Protein sequence remote homology detection and alignment using novel dynamic programming and profile-profile scoring. Proteins 2006

by Derek A. Debe, Joseph F. Danzer, William A. Goddard, Ar Poleksic
Cited by 4 (0 self)
ABSTRACT STRUCTFAST is a novel profile–profile alignment algorithm capable of detecting weak similarities between protein sequences. The increased sensitivity and accuracy of the STRUCTFAST method are achieved through several unique features. First, the algorithm utilizes a novel dynamic programming engine capable of incorporating important information from a structural family directly into the alignment process. Second, the algorithm employs a rigorous analytical formula for profile–profile scoring to overcome the limitations of ad hoc scoring functions that require adjustable parameter training. Third, the algorithm employs Convergent Island Statistics (CIS) to compute the statistical significance of alignment scores independently for each pair of sequences. STRUCTFAST routinely produces alignments that meet or exceed the quality obtained by an expert human homology modeler, as evidenced by its performance in the latest CAFASP4 and CASP6 blind prediction benchmark experiments. Proteins 2006;64:960–967. © 2006 Wiley-Liss, Inc. Key words: protein structure; homology modeling; comparative modeling; alignment algorithms; alignment statistics

Generalized Pattern Search Algorithm for Peptide Structure Prediction

by Giovanni Stracquadanio, 2008
Cited by 3 (2 self)
ABSTRACT Finding the near-native structure of a protein is one of the most important open problems in structural biology and biological physics. The problem becomes dramatically more difficult when a given protein has no regular secondary structure or it does not show a fold similar to structures already known. This situation occurs frequently when we need to predict the tertiary structure of small molecules, called peptides. In this research work, we propose a new ab initio algorithm, the generalized pattern search algorithm, based on the well-known class of Search-and-Poll algorithms. Inspired by the approach proposed by other researchers, we performed an extensive set of simulations over a well-known set of 44 peptides to investigate the robustness and reliability of the proposed algorithm, and we compared the peptide conformation with a state-of-the-art algorithm for peptide structure prediction known as PEPstr. In particular, we tested the algorithm on the instances proposed by the originators of PEPstr, to validate the proposed algorithm; the experimental results confirm that the generalized pattern search algorithm outperforms ... When analyzing the complex structure of a biological system, proteins are the most attracting molecular devices. They are likely involved in all processes of a living organism; they are responsible for behavioral changes in the cells. Due to the

Citation Context

...ertain number of methods. In this class, LOMETS (27) is one of the best methods; it combines the output of nine of the most used algorithms in the literature (i.e., FUGUE (7); PROSPECT2 (28); SPARKS2 (29); SP3 (30); SAM-T02 (31); HHSEARCH (32); PPA1 (27); PPA2 (27); and PAINT (27)). The Robetta server (15) combines homology modeling and de novo tertiary structure prediction with Ginzu homology identif...

Incorporation of Local Structural Preference Potential Improves Fold Recognition

by Yun Hu, Xiaoxi Dong, Aiping Wu, Yang Cao, Liqing Tian, Taijiao Jiang, 2010
Cited by 2 (0 self)
Fold recognition, or threading, is a popular protein structure modeling approach that uses known structure templates to build structures for proteins of unknown structure. The key to the success of fold recognition methods lies in the proper integration of sequence, physicochemical and structural information. Here we introduce another type of information, local structural preference potentials of 3-residue and 9-residue fragments, for fold recognition. By combining the two local structural preference potentials with the widely used sequence profile, secondary structure information and hydrophobic score, we have developed a new threading method called FR-t5 (fold recognition by use of 5 terms). In benchmark tests, we have found that the consideration of local structural preference potentials in FR-t5 not only greatly enhances the alignment accuracy and recognition sensitivity, but also significantly improves the quality of prediction models.

Citation Context

...e alignment), model building, and model quality evaluation. The first two steps are the key steps in the TBM process, improvement of which can greatly improve the quality of the final predicted model [1,2,3,4,5,6,7,8,9,10,11,12,13]. For target sequences with high sequence similarity to those of structure templates, the structural templates can be easily identified and the target sequences can be reliably aligned to the structur...

SURVEY AND SUMMARY: Practical lessons from protein structure prediction

by Krzysztof Ginalski, Nick V. Grishin, Adam Godzik, Leszek Rychlewski - 2005, pp. 1874–1891
Cited by 2 (0 self)
Despite recent efforts to develop automated protein structure determination protocols, structural genomics projects are slow in generating fold assignments for complete proteomes, and spatial structures remain unknown for many protein families. Alternative cheap and fast methods to assign folds using prediction algorithms continue to provide valuable structural information for many proteins. The development of high-quality prediction methods has been boosted in recent years by objective community-wide assessment experiments. This paper gives an overview of the currently available practical approaches to protein structure prediction capable of generating accurate fold assignments. Recent advances in the assessment of prediction quality are also discussed.

Citation Context

...o assess the alignment reliability. RAPTOR is quite new and was very successful in CASP-5. SPARKS (Sequence, secondary structure Profiles And Residue-level Knowledge-based Score for fold recognition) (109) uses single-body residue-level knowledge-based energy score combined with sequence profile and secondary structure information for fold recognition. PROSPECT (PROtein Structure Prediction and Evaluati...
