Prediction of protein secondary structure at better than 70% accuracy (1993)

by B Rost, C Sander
Venue: J. Mol. Biol

Results 1 - 10 of 759

Protein Secondary Structure Prediction . . .

by David T. Jones - J. Mol. Biol. (1999) 292, 195-202, 1999
Cited by 945 (19 self)

The PredictProtein server

by Burkhard Rost, Guy Yachdav, Jinfeng Liu, 2004
Cited by 228 (21 self)

Multi-class Protein Fold Recognition Using Support Vector Machines and Neural Networks

by Chris H. Q. Ding, Inna Dubchak - Bioinformatics, 2001
Cited by 207 (8 self)
Motivation: Protein fold recognition is an important approach to structure discovery without relying on sequence similarity. We study this approach with new multi-class classification methods and examined many issues important for a practical recognition system. Results: Most current discriminative methods for protein fold prediction use the one-against-others method, which has the well-known "False Positives" problem. We investigated two new methods: the unique one-against-others and the all-against-all methods. Both improve prediction accuracy by 14-110% on a dataset containing 27 SCOP folds. We used the Support Vector Machine and the Neural Network learning methods as base classifiers. SVM converges fast and leads to high accuracy. When scores of multiple parameter datasets are combined, majority voting reduces noise and increases recognition accuracy. We examined many issues involved with a large number of classes, including dependencies of prediction accuracy on the number of folds and on the number of representatives in a fold. Overall, recognition systems achieve 56% fold prediction accuracy on a protein test dataset, where most of the proteins have below 25% sequence identity with the proteins used in training. Contact: chqding@lbl.gov, ildubchak@lbl.gov Supplementary Information: The protein parameter datasets used in this paper are available online (http://www.nersc.gov/ cding/protein). Keywords: protein fold recognition, protein structure, multi-class classification, support vector machines, neural networks.
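
The majority-voting step mentioned in this abstract can be illustrated with a minimal sketch. It assumes that each parameter dataset yields one predicted fold label per protein. The function name `majority_vote`, the example labels, and the tie-breaking rule (the tied label that appears first in the input wins) are illustrative assumptions, not details taken from the paper.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label among the votes.

    Tie-breaking is an assumption of this sketch: among labels with the
    same top count, the one that appears first in `labels` is returned.
    """
    if not labels:
        raise ValueError("need at least one vote")
    counts = Counter(labels)
    best = max(counts.values())
    for label in labels:
        if counts[label] == best:
            return label


# Hypothetical votes from three classifiers, one per parameter dataset.
votes = ["fold_7", "fold_3", "fold_7"]
print(majority_vote(votes))
```

With an odd number of voters and two candidate labels a tie cannot occur; with many fold classes ties are possible, which is why the sketch states its tie-breaking rule explicitly.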

Citation Context

..., there are no such concepts as true positives and false positives. Therefore we need an accuracy measure which can deal with all situations. In this paper, we use the standard Q percentage accuracy (Rost & Sander, 1993, Baldi et al, 2000), generalized to handle true positives and false positives. Suppose we have $N = n_1 + n_2 + \cdots + n_K$ test proteins ($n_1$ are observed to belong to class $F_1$, etc.). Suppose that ou...
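
The excerpt above is cut off before it defines the measure, so the following is only a sketch of the usual reading of a Q percentage accuracy: the fraction of test items whose predicted class equals the observed class, overall and per class. The function name `q_accuracy` and the per-class breakdown are assumptions of this sketch, not the paper's exact generalization.

```python
from collections import Counter


def q_accuracy(observed, predicted):
    """Return (overall, per_class) accuracy as fractions in [0, 1].

    overall      = correctly predicted items / N
    per_class[k] = correctly predicted items of class k / n_k
    where n_k counts the items observed to belong to class k.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have equal length")
    if not observed:
        raise ValueError("need at least one test item")
    n = Counter(observed)
    correct = Counter(o for o, p in zip(observed, predicted) if o == p)
    per_class = {k: correct[k] / n[k] for k in n}
    overall = sum(correct.values()) / len(observed)
    return overall, per_class


# Hypothetical labels for N = 4 test proteins in K = 3 classes.
observed = ["F1", "F1", "F2", "F3"]
predicted = ["F1", "F2", "F2", "F3"]
overall, per_class = q_accuracy(observed, predicted)
print(overall, per_class)
```

Multiplying the returned fractions by 100 gives percentages.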

Review: Protein Secondary Structure Prediction Continues to Rise

by Burkhard Rost - J. Struct. Biol, 2001
Cited by 180 (22 self)
f prediction accuracy? We shall see. 2001 Academic Press. INTRODUCTION. History. Linus Pauling correctly guessed the formation of helices and strands (14, 15) (and falsely hypothesized other structures). Three years before Pauling's guess was verified by the publications of the first X-ray structures (16, 17), one group had already ventured to predict secondary structure from sequence (18). The first-generation prediction methods following in the 1960s and 1970s were all based on single amino acid propensities (19). The second-generation methods dominating the scene until the early 1990s used propensities for segments of 3-51 adjacent residues (19). Basically any imaginable theoretical algorithm had been applied to the problem of predicting secondary structure from sequence. However, it seemed that prediction accuracy stalled at levels slightly above 60% (percentage of residues predicted correctly in one of the three states: helix, strand, and other). The reason for this limit was the

Citation Context

...pt into an automatic prediction method. However, the breakthrough of the third-generation methods to levels above 70% accuracy required a combination of larger databases with more advanced algorithms (19, 22). The major component of these new methods was the use of evolutionary information. All naturally evolved proteins with more than 35% pairwise identical residues over more than 100 aligned residues ha...

Prediction and functional analysis of native disorder in proteins from the three kingdoms of life

by J. J. Ward, J. S. Sodhi, L. J. McGuffin, B. F. Buxton, D. T. Jones - J. Mol. Biol, 2004
Cited by 155 (4 self)
One of the central tenets of structural biology is that the function of a protein is determined by its three-dimensional structure. As a result, predicting protein structure has often been at the forefront of

Exploiting the Past and the Future in Protein Secondary Structure Prediction

by Pierre Baldi, Søren Brunak, Paolo Frasconi, Giovanni Soda, Gianluca Pollastri, 1999
Cited by 154 (30 self)
Motivation: Predicting the secondary structure of a protein (alpha-helix, beta-sheet, coil) is an important step towards elucidating its three-dimensional structure, as well as its function. Presently, the best predictors are based on machine learning approaches, in particular neural network architectures with a fixed, and relatively short, input window of amino acids, centered at the prediction site. Although a fixed small window avoids overfitting problems, it cannot capture variable long-range information. Results: We introduce a family of novel architectures which can learn to make predictions based on variable ranges of dependencies. These architectures extend recurrent neural networks, introducing non-causal bidirectional dynamics to capture both upstream and downstream information. The prediction algorithm is completed by the use of mixtures of estimators that leverage evolutionary information, expressed in terms of multiple alignments, both at the input and output levels. While our system currently achieves an overall performance close to 76% correct prediction (at least comparable to the best existing systems), the main emphasis here is on the development of new algorithmic ideas. Availability: The executable program for predicting protein secondary structure is available from the authors free of charge. Contact: pfbaldi@ics.uci.edu, gpollast@ics.uci.edu, brunak@cbs.dtu.dk, paolo@dsi.unifi.it.
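
The fixed input window that this abstract contrasts with its recurrent architectures can be sketched as follows. This is only an illustration of extracting a window of residues centered at each prediction site; the function name `centered_windows`, the padding character `-`, and the example sequence are assumptions of this sketch and do not come from the paper.

```python
def centered_windows(sequence, half_width, pad="-"):
    """Return one window per residue, each of length 2 * half_width + 1.

    Positions that fall off either end of the sequence are filled with
    the single-character `pad` symbol, so every residue has a window
    centered on it.
    """
    if half_width < 0:
        raise ValueError("half_width must be non-negative")
    if len(pad) != 1:
        raise ValueError("pad must be a single character")
    padded = pad * half_width + sequence + pad * half_width
    size = 2 * half_width + 1
    return [padded[i:i + size] for i in range(len(sequence))]


# Hypothetical four-residue sequence with one neighbor on each side.
for window in centered_windows("ACDE", half_width=1):
    print(window)
```

A predictor built this way sees only the residues inside each window, which is the limitation on long-range information that the abstract describes.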

Prediction of local structure in proteins using a library of sequence-structure motifs

by Christopher Bystroff, David Baker - J. Mol. Biol, 1998
Cited by 152 (20 self)

Citation Context

...ent. Combination of I-sites and conventional secondary structure predictions. To determine whether I-sites local structure predictions are complementary to three-state secondary structure predictions (Rost & Sander, 1993a,b, 1994), the secondary structure of the proteins in the test set was predicted using the PHD server (Rost et al., 1994). Comparison of the results of the two methods requires either translating the...

Topology Prediction for Helical Transmembrane Proteins at 86% Accuracy

by Burkhard Rost, Piero Fariselli, Rita Casadio - Protein Sci, 1996
Cited by 126 (17 self)
Previously, we introduced a neural network system predicting locations of transmembrane helices based on evolutionary profiles (PHDhtm; Rost et al., 1995). Here, we describe an improvement and an extension of that system. The improvement is achieved by a dynamic programming-like algorithm that optimises helices compatible with the neural network output. The extension is the prediction of topology (orientation of first loop region with respect to membrane) by applying to the refined prediction the observation that positively charged residues are more abundant in extra-cytoplasmic regions. Furthermore, we introduce a method to reduce the number of false positives, i.e., proteins falsely predicted with membrane helices. The evaluation of prediction accuracy is based on cross-validation and a double-blind test set (in total 131 proteins). The final method appears to be more accurate than other methods published. (1) For almost 89% (±3%) of the test proteins all transmembrane helices are predicted correctly. (2) For more than 86% (±3%) of the proteins topology is predicted correctly. (3) We define reliability indices which correlate with prediction accuracy: for one half of the proteins segment accuracy rises to 98%; and for two-thirds accuracy of topology prediction is 95%. (4) The rate of proteins for which transmembrane helices are predicted falsely is below 2% (±1%). Finally, the method is applied to 1616 sequences of Haemophilus influenzae. We predict 19% of the genome sequences to contain one or more transmembrane helices. This appears to be lower than what we predicted previously for the yeast VIII chromosome (about 25%).

Hybrid fold recognition: Combining sequence derived properties with evolutionary information.

by D Fischer - Pac. Symp. Biocomput., 2000
Cited by 102 (7 self)

Disk-covering, a fast-converging method for phylogenetic tree reconstruction

by Daniel H. Huson, Scott M. Nettles, Tandy J. Warnow - Journal of Computational Biology, 1999
Cited by 92 (10 self)
The evolutionary history of a set of species is represented by a phylogenetic tree, which is a rooted, leaf-labeled tree, where internal nodes represent ancestral species and the leaves represent modern day species. Accurate (or even boundedly inaccurate) topology reconstructions of large and divergent trees from realistic length sequences have long been considered one of the major challenges in systematic biology. In this paper, we present a simple method, the Disk-Covering Method (DCM), which boosts the performance of base phylogenetic methods under various Markov models of evolution. We analyze the performance of DCM-boosted distance methods under the Jukes–Cantor Markov model of biomolecular sequence evolution, and prove that for almost all trees, polylogarithmic length sequences suffice for complete accuracy with high probability, while polynomial length sequences always suffice. We also provide an experimental study based upon simulating sequence evolution on model trees. This study confirms substantial reductions in error rates at realistic sequence lengths.

Citation Context

...ogy (Dobzhansky, 1993). [Evolutionary trees are often the basis of multiple sequence alignment algorithms (Gusfield, 1991; Gusfield and Wang, 1996; Hein, 1989), protein structure prediction routines (Rost and Sander, 1993), and other problems in biology.] Experimentally investigating the performance of phylogenetic methods by simulating sequence evolution on different model trees in order to determine how the sequence...
