Large-scale prediction of disulphide bridges using kernel methods, two-dimensional recursive neural networks, and weighted graph matching. (2006)

by J Cheng, H Saigo, P Baldi
Venue: Proteins

NNcon: Improved Protein Contact Map Prediction Using 2D-Recursive Neural Networks

by Allison N. Tegge, Zheng Wang, Jesse Eickholt, Jianlin Cheng - Nucleic Acids Research, 2009
Cited by 22 (3 self)
doi:10.1093/nar/gkp305

DISULFIND: a disulfide bonding state and cysteine connectivity prediction server

by Alessio Ceroni, Andrea Passerini, Alessandro Vullo, Paolo Frasconi - Nucleic Acids Res, 2006
Cited by 17 (2 self)
DISULFIND is a server for predicting the disulfide bonding state of cysteines and their disulfide connectivity starting from sequence alone. Optionally, disulfide connectivity can be predicted from sequence and a bonding state assignment given as input. The output is a simple visualization of the assigned bonding state (with confidence degrees) and the most likely connectivity patterns. The server is available at

Citation Context

...ant using dipeptides as features. Martelli et al. (6) suggested the use of hidden Markov models to refine local predictions obtained via neural networks. SVMs are also used in the method presented in (7). Prediction of connectivity patterns was pioneered in (8) with a method based on weighted graph matching, implemented in the prediction server DCON. Vullo and Frasconi (9) introduced the use of multi...
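
As an illustration of the weighted graph matching idea mentioned in this context, the following is a minimal sketch, not the DCON implementation or any cited author's code. It assumes a hypothetical symmetric matrix `weights` in which `weights[i][j]` is some score for cysteines `i` and `j` being bonded, and it enumerates all perfect pairings by brute force, which is only practical for a small number of cysteines.

```python
def best_pairing(weights):
    """Return (score, pairs) for the highest-scoring perfect pairing.

    `weights` is a hypothetical symmetric matrix of pairwise bonding
    scores; indices stand for bonded cysteines. Brute-force search.
    """
    n = len(weights)
    if n % 2 != 0:
        raise ValueError("a perfect pairing needs an even number of cysteines")

    def solve(remaining):
        # No cysteines left: an empty pairing with score 0.
        if not remaining:
            return 0.0, []
        first, rest = remaining[0], remaining[1:]
        best_score, best_pairs = float("-inf"), []
        # Pair the first remaining cysteine with every possible partner.
        for k, partner in enumerate(rest):
            score, pairs = solve(rest[:k] + rest[k + 1:])
            score += weights[first][partner]
            if score > best_score:
                best_score, best_pairs = score, [(first, partner)] + pairs
        return best_score, best_pairs

    return solve(tuple(range(n)))


if __name__ == "__main__":
    w = [
        [0.0, 0.9, 0.1, 0.3],
        [0.9, 0.0, 0.4, 0.2],
        [0.1, 0.4, 0.0, 0.8],
        [0.3, 0.2, 0.8, 0.0],
    ]
    print(best_pairing(w))
```

The number of perfect pairings grows very quickly with the number of cysteines, so a polynomial-time maximum-weight matching algorithm would be the natural replacement for the exhaustive search in anything beyond a toy setting.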

Identifying Cysteines and Histidines in Transition-Metal-Binding Sites Using Support Vector Machines and Neural Networks

by Andrea Passerini, Marco Punta, Alessio Ceroni, Burkhard Rost, Paolo Frasconi
Cited by 17 (10 self)
Accurate predictions of metal-binding sites in proteins by using sequence as the only source of information can significantly help in the prediction of protein structure and function, genome annotation, and in the experimental determination of protein structure. Here, we introduce a method for identifying histidines and cysteines that participate in binding of several transition metals and iron complexes. The method predicts histidines as being in either of two states (free or metal bound) and cysteines in any of three states (free, metal bound, or in disulfide bridges). The method uses only sequence information by utilizing position-specific evolutionary profiles as well as more global descriptors such as protein length and amino acid composition. Our solution is based on a two-stage machine-learning approach. The first stage consists of a support vector machine trained to locally classify the binding state of single histidines and cysteines. The second stage consists of a bidirectional recurrent neural network trained to refine local predictions by taking into account dependencies among residues within the same protein. A simple finite state automaton is employed as a postprocessing step in the second stage in order to enforce an even number of disulfide-bonded cysteines. We predict histidines and cysteines in transition-metal-binding sites at 73% precision and 61% recall. We observe significant differences in performance depending on the ligand (histidine or cysteine) and on the metal bound. We also predict cysteines participating in disulfide bridges at 86% precision and 87% recall. Results are compared to those that would be obtained by using expert information as represented by PROSITE motifs and, for disulfide bonds, to state-of-the-art methods. Proteins 2006;
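
The even-parity constraint described in this abstract can be illustrated with a small sketch. This is not the authors' automaton: it assumes hypothetical independent per-cysteine probabilities of being disulfide-bonded (`p_bonded`) and uses a two-state parity automaton with dynamic programming to pick the most likely labelling that contains an even number of bonded cysteines.

```python
import math


def even_parity_assignment(p_bonded):
    """Most likely 0/1 labelling (1 = bonded) with an even number of 1s.

    `p_bonded` is a hypothetical list of per-cysteine bonding
    probabilities, treated as independent for this illustration.
    """
    neg_inf = float("-inf")
    # best[parity] = (log-probability, labels so far); parity 0 = even.
    best = {0: (0.0, []), 1: (neg_inf, [])}
    for p in p_bonded:
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # avoid log(0)
        log_free, log_bond = math.log(1.0 - p), math.log(p)
        new_best = {}
        for parity in (0, 1):
            # Labelling this cysteine free keeps the parity;
            # labelling it bonded arrives from the opposite parity.
            stay = (best[parity][0] + log_free, best[parity][1] + [0])
            flip = (best[1 - parity][0] + log_bond, best[1 - parity][1] + [1])
            new_best[parity] = max(stay, flip, key=lambda c: c[0])
        best = new_best
    # Only the even-parity state is an accepting state.
    return best[0][1]


if __name__ == "__main__":
    print(even_parity_assignment([0.9, 0.8, 0.6]))
```

With three cysteines that all look bonded when considered individually, the sketch flips the least confident one to free so that the bonded cysteines can be paired.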

Disulfide bonding state prediction with SVM based on protein types

by Chih-ying Lin, Chang-biau Yang, Chiou-yi Hor, Kuo-si Huang - Bio-Inspired Computing: Theories and Applications, 2010
Cited by 3 (3 self)
Disulfide bonds play a key role in predicting the three-dimensional structure and the function of a protein. In this paper, we propose an algorithm for predicting the disulfide bonding state of each cysteine in a protein sequence. This method is based on the multi-stage framework and the multi-classifier of the support vector machine. We also design a new training strategy to increase the prediction accuracy. It appends the probabilities to the existing features and then repeatedly starts a new training procedure to improve performance. We perform the experiments on a data set derived from the well-known Protein Data Bank (PDB). We obtain 94.2% accuracy for predicting disulfide bonding state, an improvement of 3.5% over the previous best result of 90.7%. Index Terms: disulfide bond; bioinformatics; support vector machine; cysteine state prediction.

Citation Context

...bic core of the folded protein through condensing hydrophobic residues around itself. Therefore, the disulfide bond might have significant information for protein structure and function. Cheng et al. [8] defined the disulfide bond prediction problem as the classification problem of four different levels. First, a protein may have several chains. Researchers may want to know which protein chains conta...

Private correspondence

by Yong-zhen Zhang, 1998
Cited by 3 (0 self)
Molecular diversity and phylogeny of Hantaan virus in Guizhou, China: evidence for Guizhou as a radiation center of the present Hantaan virus

Citation Context

...e-bond reshuffling in RNase 8 evolution: DISULFIND analysis Because computational predictions of disulfide bonds are not always correct, we tried two other commonly used prediction algorithms, DiPro (Cheng et al. 2006) and DISULFIND (Vullo and Frasconi 2004). DiPro uses machine-learning methods to predict whether a given protein chain contains intrachain disulfide bonds and uses recursive neural networks to pred...

Machine Learning Methods for Protein Structure Prediction

by Jianlin Cheng, Allison N. Tegge, Pierre Baldi
Cited by 3 (0 self)
Machine learning methods are widely used in bioinformatics and computational and systems biology. Here, we review the development of machine learning methods for protein structure prediction, one of the most fundamental problems in structural biology and bioinformatics. Protein structure prediction is such a complex problem that it is often decomposed and attacked at four different levels: 1-D prediction of structural features along the primary sequence of amino acids; 2-D prediction of spatial relationships between amino acids; 3-D prediction of the tertiary structure of a protein; and 4-D prediction of the quaternary structure of a multiprotein complex. A diverse set of both supervised and unsupervised machine learning methods has been applied over the years to tackle these problems and has significantly contributed to advancing the state-of-the-art of protein structure prediction. In this paper, we review the development and application of hidden Markov models, neural networks, support vector machines, Bayesian methods, and clustering methods in 1-D, 2-D, 3-D, and 4-D protein structure predictions. Index Terms: Bioinformatics, machine learning, protein folding, protein structure prediction.

Citation Context

...rotein sequence (Fig. 2). The 2-D prediction focuses on predicting the spatial relationship between residues, such as distance and contact map prediction [28], [29] and disulfide bond prediction [30]–[33] (Fig. 3). One essential characteristic of these 2-D representations is that they are independent of any rotations and translations of the protein, therefore independent of any frame of coordinates, w...

Comparative Analysis of Disulfide Bond Determination Using Computational-Predictive Methods and Mass Spectrometry-Based Algorithmic Approach

by Timothy Lee, Rahul Singh - CCIS, 2008
Cited by 2 (1 self)
Identifying the disulfide bonding pattern in a protein is critical to understanding its structure and function. A large number of state-of-the-art computational strategies have been proposed that predict the disulfide bonding pattern using sequence-level information. The recent past has also seen a surge in the use of mass spectrometric (MS) methods in proteomics. Mass spectrometry-based analysis can also be used to determine disulfide bonds. Furthermore, MS methods can work with lower sample purity when compared with x-ray crystallography or NMR. However, without the assistance of computational techniques, MS-based identification of disulfide bonds is time-consuming and complicated. In this paper we present an algorithmic solution to this problem and examine how the proposed method successfully deals with some of the key challenges in mass spectrometry. Using data from the analysis of nine eukaryotic glycosyltransferases with varying numbers of cysteines and disulfide bonds, we perform a detailed comparative analysis between the MS-based approach and a number of computational-predictive methods. These experiments highlight the tradeoffs between these classes of techniques and provide critical insights for further advances in this important problem domain.

Citation Context

...infer) the disulfide connectivity based on sequence data. (3) Mass-spectrometry-based techniques that detect disulfide bonded peptides by analyzing a mixture of peptides obtained by targeted digestion of an intact protein. Crystallographic methods can be used to study a subdomain of the protein that is sufficiently soluble and may form crystals. However, such methods can rarely be used in medium or high-throughput settings. Consequently, in the recent past, significant attention has been given to computational methods that can predict disulfide connectivity based on sequence information alone [2, 3, 4, 5, 6, 7, 8, 9, 10, 27]. An important advantage of these predictive methods lies in the fact that they require only sequence-level data to make predictions. Recent results in this area have reported high accuracies with Qp values (fraction of proteins in the test set with disulfide connectivity correctly predicted) in the 70–78% range. These methods also report high Qc (sensitivity) values. However, in interpreting, extrapolating, and understanding these performance values, the following considerations are especially critical: 1. Most of the reported results use a dataset called SP39 of non-redundant sequences der...
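
The two performance measures named in this context can be made concrete with a short sketch. Qp follows the definition quoted above (fraction of proteins whose whole connectivity pattern is predicted correctly). For Qc, the context only says "sensitivity", so the sketch assumes it means the fraction of true disulfide bonds that are recovered. The function name `qp_qc` and the representation of a bond as a pair of cysteine indices are illustrative choices, not taken from the cited papers.

```python
def normalize(bonds):
    """Treat each bond as an unordered pair of cysteine indices."""
    return {tuple(sorted(bond)) for bond in bonds}


def qp_qc(predicted, actual):
    """Return (Qp, Qc) for per-protein lists of predicted and true bonds.

    Qp: fraction of proteins with the full bond pattern predicted correctly.
    Qc: fraction of true bonds recovered (assumed meaning of 'sensitivity').
    """
    correct_proteins = 0
    correct_bonds = 0
    total_bonds = 0
    for pred, true in zip(predicted, actual):
        pred_set, true_set = normalize(pred), normalize(true)
        correct_proteins += int(pred_set == true_set)
        correct_bonds += len(pred_set & true_set)
        total_bonds += len(true_set)
    return correct_proteins / len(actual), correct_bonds / total_bonds


if __name__ == "__main__":
    actual = [[(1, 4), (2, 3)], [(1, 2)]]
    predicted = [[(4, 1), (2, 3)], [(1, 3)]]
    print(qp_qc(predicted, actual))
```

In the usage example, the first protein is fully correct and the second is not, so Qp is one half while two of the three true bonds are recovered.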

Machine Learning Algorithms for Protein Structure Prediction

by Jianlin Cheng, 2006
Cited by 1 (1 self)

Disease risk of missense mutations using structural inference from predicted function

by Jeremy A Horst, Kai Wang, Orapin V Horst, Michael L Cunningham, Ram Samudrala - Curr. Protein Pept. Sci., 2010
Cited by 1 (1 self)
Advancements in sequencing techniques place personalized genomic medicine upon the horizon, bringing along the responsibility of clinicians to understand the likelihood for a mutation to cause disease, and of scientists to separate etiology from nonpathologic variability. Pathogenicity is discernible from patterns of interactions between a missense mutation, the surrounding protein structure, and intermolecular interactions. Physicochemical stability calculations are not accessible without structures, as is the case for the vast majority of human proteins, so diagnostic accuracy remains in infancy. To model the effects of missense mutations on functional stability without structure, we combine novel protein sequence analysis algorithms to discern spatial distributions of sequence, evolutionary, and physicochemical conservation, through a new approach to optimize component selection. Novel components include a combinatory substitution matrix and two heuristic algorithms that detect positions which confer structural support to interaction interfaces. The method reaches 0.91 AUC in ten-fold cross-validation to predict alteration of function for 6,392 in vitro mutations. For clinical utility we trained the method on 7,022 disease associated missense mutations within the Online Mendelian Inheritance in Man amongst a larger randomized set. In a blinded prospective test to delineate mutations unique to 186 patients with craniosynostosis from those in the 95 highly variant Coriell controls and 2000 control chromosomes, we achieved roughly 1/3 sensitivity and perfect specificity. The component algorithms retained during machine learning constitute novel protein sequence analysis techniques to describe environments supporting neutrality or pathology of mutations. This approach to pathogenetics enables new insight into the mechanistic relationship of missense mutations to disease phenotypes in our patients.

Citation Context

...ations generally and interaction interface support residues specifically. We use existing sequence analytic knowledge based algorithms to predict secondary structure, solvent exposure, burial, disorder, domain restraints, and nonlocal contact prediction at multiple shell radii to substitute tertiary structure information. All sequence analytic methods applied here are implemented on the results from a single default PSI-BLAST run [21]. The structure features are predicted using the suite of software kindly provided to the community by Jianlin Cheng, selecting ab initio methods where available [22, 23, 24, 25, 26, 27]. These methods performed as the best or near best in each related category of the 8th Community wide experiment on the critical assessment of methods for protein structure prediction (CASP8) [28]. In this work we demonstrate how these predicted structural parameters can derive functional importance, thereby finessing dependence on high quality structural data for the problem of separating insignificant missense mutations from disease risk inducing mutations. Relation to other Methods for Predicting Phenotypic Missense Mutations Amino Acid Substitution Matrices It is unclear what data set firs...

PERMISSION TO USE

by Courtney Solheim
In presenting this thesis in partial fulfillment of the requirements for a Postgraduate degree from the University of Saskatchewan, I agree that the Libraries of this University may make it freely available for inspection. I further agree that permission for copying of this thesis in any manner, in whole or in part, for scholarly purposes may be granted by the professor or professors who supervised my thesis work or, in their absence, by the Head of the Department or the Dean of the College in which my thesis work was done. It is understood that any copying or publication or use of this thesis or parts thereof for financial gain shall not be allowed without my written permission. It is also understood that due recognition shall be given to me and to the University of Saskatchewan in any scholarly use which may be made of any material in my thesis.