Results 1 - 10 of 40
NNcon: Improved Protein Contact Map Prediction Using 2D-Recursive Neural Networks
- Nucleic Acids Research, 2009
"... doi:10.1093/nar/gkp305 ..."
DISULFIND: a disulfide bonding state and cysteine connectivity prediction server
- Nucleic Acids Res, 2006
Cited by 17 (2 self)
DISULFIND is a server for predicting the disulfide bonding state of cysteines and their disulfide connectivity starting from sequence alone. Optionally, disulfide connectivity can be predicted from sequence and a bonding state assignment given as input. The output is a simple visualization of the assigned bonding state (with confidence degrees) and the most likely connectivity patterns. The server is available at
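Once the bonding state is fixed, a connectivity pattern is a pairing of the bonded cysteines, and 2B bonded cysteines admit (2B-1)!! such pairings. The sketch below enumerates them and ranks them by a sum of pairwise scores. It is only an illustration: the function names, the cysteine positions, and the `pair_score` values are hypothetical, and this is not a description of how DISULFIND itself scores patterns.

```python
def connectivity_patterns(cysteines):
    """Yield every way to pair up a list of bonded cysteine positions.

    An odd-length list yields nothing, since no full pairing exists.
    """
    if not cysteines:
        yield []
        return
    first, rest = cysteines[0], cysteines[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for pattern in connectivity_patterns(remaining):
            yield [(first, partner)] + pattern


def rank_patterns(cysteines, pair_score):
    """Sort all patterns by the sum of their (hypothetical) pair scores, best first."""
    scored = [
        (sum(pair_score[pair] for pair in pattern), pattern)
        for pattern in connectivity_patterns(cysteines)
    ]
    return sorted(scored, key=lambda item: item[0], reverse=True)


# Hypothetical positions and pairwise scores, for illustration only.
cys = [5, 12, 30, 41]
scores = {(5, 12): 0.1, (5, 30): 0.7, (5, 41): 0.2,
          (12, 30): 0.3, (12, 41): 0.8, (30, 41): 0.4}
best_score, best_pattern = rank_patterns(cys, scores)[0]
```

Exhaustive enumeration is only practical for a handful of bonds, because the number of patterns grows as a double factorial.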
Identifying Cysteines and Histidines in Transition-Metal-Binding Sites Using Support Vector Machines and Neural Networks
Cited by 17 (10 self)
Abstract: Accurate predictions of metal-binding sites in proteins by using sequence as the only source of information can significantly help in the prediction of protein structure and function, genome annotation, and the experimental determination of protein structure. Here, we introduce a method for identifying histidines and cysteines that participate in binding of several transition metals and iron complexes. The method predicts histidines as being in one of two states (free or metal bound) and cysteines in one of three states (free, metal bound, or in disulfide bridges). The method uses only sequence information by utilizing position-specific evolutionary profiles as well as more global descriptors such as protein length and amino acid composition. Our solution is based on a two-stage machine-learning approach. The first stage consists of a support vector machine trained to locally classify the binding state of single histidines and cysteines. The second stage consists of a bidirectional recurrent neural network trained to refine local predictions by taking into account dependencies among residues within the same protein. A simple finite state automaton is employed as a postprocessing step in the second stage to enforce an even number of disulfide-bonded cysteines. We predict histidines and cysteines in transition-metal-binding sites at 73% precision and 61% recall. We observe significant differences in performance depending on the ligand (histidine or cysteine) and on the metal bound. We also predict cysteines participating in disulfide bridges at 86% precision and 87% recall. Results are compared to those that would be obtained by using expert information as represented by PROSITE motifs and, for disulfide bonds, to state-of-the-art methods. Proteins 2006;
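The even-parity constraint can be illustrated with a two-state automaton (even or odd number of bonded cysteines seen so far) decoded by dynamic programming. This is a minimal sketch, not the authors' implementation: it simplifies the three cysteine states to bonded versus not bonded, treats the per-cysteine probabilities as independent, and the function name and input values are hypothetical.

```python
import math


def even_parity_decode(p_bonded):
    """Most likely bonded(1)/not-bonded(0) labelling with an even number of 1s.

    p_bonded: hypothetical per-cysteine probabilities of being disulfide-bonded,
    assumed independent for the purposes of this sketch.
    """
    neg_inf = float("-inf")
    # best[state] = (log-probability, labels) of the best prefix whose
    # bonded count has parity `state` (0 = even, 1 = odd).
    best = {0: (0.0, []), 1: (neg_inf, [])}
    for p in p_bonded:
        p = min(max(p, 1e-12), 1 - 1e-12)  # keep log() finite
        nxt = {0: (neg_inf, []), 1: (neg_inf, [])}
        for state, (score, labels) in best.items():
            if score == neg_inf:
                continue  # this parity is not reachable yet
            for bonded, logp in ((0, math.log(1 - p)), (1, math.log(p))):
                new_state = state ^ bonded  # a bonded cysteine flips the parity
                candidate = score + logp
                if candidate > nxt[new_state][0]:
                    nxt[new_state] = (candidate, labels + [bonded])
        best = nxt
    return best[0][1]  # only the even-parity end state is accepted


labels = even_parity_decode([0.9, 0.8, 0.6])
```

With three cysteines that would each be labelled bonded on their own, the decoder flips the least confident one so that the bonded count is even.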
Disulfide bonding state prediction with SVM based on protein types
- Bio-Inspired Computing: Theories and Applications, 2010
Cited by 3 (3 self)
Abstract: Disulfide bonds play a key role in predicting the three-dimensional structure and the function of a protein. In this paper, we propose an algorithm for predicting the disulfide bonding state of each cysteine in a protein sequence. The method is based on a multi-stage framework and a multi-classifier built from support vector machines. We also design a new training strategy to increase the prediction accuracy: it appends the predicted probabilities to the existing features and then repeats the training procedure to improve performance. We perform the experiments on a data set derived from the Protein Data Bank (PDB). We obtain 94.2% accuracy for predicting the disulfide bonding state, an improvement of 3.5 percentage points over the previous best result of 90.7%. Index Terms: disulfide bond; bioinformatics; support vector machine; cysteine state prediction
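The training strategy described in this abstract, appending predicted probabilities to the feature vector and retraining, can be sketched as a short loop. The sketch below is an assumption-laden illustration: it swaps the paper's support vector machines for a tiny nearest-centroid classifier so that it runs with the standard library alone, and the function names, the number of rounds, and the toy data are all hypothetical.

```python
import math


def fit_centroids(X, y):
    """Stand-in classifier: one mean vector per class label (0 or 1)."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict_proba(centroids, x):
    """Pseudo-probability of class 1 from squared distances to both centroids."""
    d = {k: sum((a - b) ** 2 for a, b in zip(x, c)) for k, c in centroids.items()}
    return 1.0 / (1.0 + math.exp(min(d[1] - d[0], 700.0)))  # clamp avoids overflow


def iterative_training(X, y, rounds=3):
    """Train, append the predicted probability as a new feature, and repeat."""
    feats = [list(x) for x in X]
    models = []
    for _ in range(rounds):
        model = fit_centroids(feats, y)
        probs = [predict_proba(model, x) for x in feats]
        models.append(model)
        feats = [x + [p] for x, p in zip(feats, probs)]
    # feats carries one extra column per round, including the last round's output
    return models, feats


# Toy data, for illustration only.
X = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.8, 1.0]]
y = [0, 0, 1, 1]
models, feats = iterative_training(X, y, rounds=3)
```

In a real setting the appended probabilities would normally come from held-out predictions, since reusing training-set predictions risks overfitting; that precaution is a general practice and is not taken from the abstract.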
Private correspondence
, 1998
Cited by 3 (0 self)
Molecular diversity and phylogeny of Hantaan virus in Guizhou, China: evidence for Guizhou as a radiation center of the present Hantaan virus
Machine Learning Methods for Protein Structure Prediction
Cited by 3 (0 self)
Abstract: Machine learning methods are widely used in bioinformatics and computational and systems biology. Here, we review the development of machine learning methods for protein structure prediction, one of the most fundamental problems in structural biology and bioinformatics. Protein structure prediction is such a complex problem that it is often decomposed and attacked at four different levels: 1-D prediction of structural features along the primary sequence of amino acids; 2-D prediction of spatial relationships between amino acids; 3-D prediction of the tertiary structure of a protein; and 4-D prediction of the quaternary structure of a multiprotein complex. A diverse set of both supervised and unsupervised machine learning methods has been applied over the years to tackle these problems and has significantly contributed to advancing the state-of-the-art of protein structure prediction. In this paper, we review the development and application of hidden Markov models, neural networks, support vector machines, Bayesian methods, and clustering methods in 1-D, 2-D, 3-D, and 4-D protein structure predictions. Index Terms: Bioinformatics, machine learning, protein folding, protein structure prediction.
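The 2-D level is commonly represented as a contact map, a binary matrix marking residue pairs that lie close together in space. The sketch below derives such a map from known coordinates, which is the kind of target a 2-D predictor is trained to reproduce. The one-point-per-residue representation, the 8 Å cutoff, the `min_separation` parameter, and the toy coordinates are assumptions chosen for illustration, not values taken from the review.

```python
import math


def contact_map(coords, cutoff=8.0, min_separation=1):
    """Binary, symmetric contact map from one 3-D point per residue.

    cutoff: distance threshold in angstroms (assumed value).
    min_separation: minimum sequence distance |i - j| for a pair to be considered.
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                cmap[i][j] = cmap[j][i] = 1
    return cmap


# Four residues on a straight line, 3.8 angstroms apart (toy geometry).
chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
cmap = contact_map(chain)
```

Raising `min_separation` drops trivially close neighbours along the chain, which is a common way to focus on longer-range contacts.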
Comparative Analysis of Disulfide Bond Determination Using Computational-Predictive Methods and Mass Spectrometry-Based Algorithmic Approach
- CCIS, 2008
Cited by 2 (1 self)
Abstract. Identifying the disulfide bonding pattern in a protein is critical to understanding its structure and function. A large number of computational strategies have been proposed that predict the disulfide bonding pattern using sequence-level information. Recent years have also seen a surge in the use of mass spectrometric (MS) methods in proteomics. Mass spectrometry-based analysis can also be used to determine disulfide bonds. Furthermore, MS methods can work with lower sample purity when compared with x-ray crystallography or NMR. However, without the assistance of computational techniques, MS-based identification of disulfide bonds is time-consuming and complicated. In this paper we present an algorithmic solution to this problem and examine how the proposed method successfully deals with some of the key challenges in mass spectrometry. Using data from the analysis of nine eukaryotic glycosyltransferases with varying numbers of cysteines and disulfide bonds, we perform a detailed comparative analysis between the MS-based approach and a number of computational-predictive methods. These experiments highlight the tradeoffs between these classes of techniques and provide critical insights for further advances in this important problem domain.
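One basic ingredient of MS-based disulfide determination is mass matching: forming a disulfide bond removes two hydrogen atoms, so a cross-linked peptide pair weighs the sum of the two reduced peptides minus about 2.0157 Da. The sketch below searches for peptide pairs consistent with an observed mass. It is not the algorithm of the paper; the function name, peptide names, masses, and tolerance are hypothetical, and it assumes neutral monoisotopic masses and ignores charge states and bonds within a single peptide.

```python
from itertools import combinations

H_MASS = 1.007825  # monoisotopic mass of a hydrogen atom, in Da


def disulfide_candidates(peptides, observed_mass, tol=0.01):
    """Return peptide pairs whose disulfide-linked mass matches the observation.

    peptides: dict mapping a peptide name to the neutral monoisotopic mass of
    the reduced, cysteine-containing peptide (hypothetical inputs).
    """
    hits = []
    for (a, mass_a), (b, mass_b) in combinations(sorted(peptides.items()), 2):
        linked = mass_a + mass_b - 2 * H_MASS  # two hydrogens lost on bond formation
        if abs(linked - observed_mass) <= tol:
            hits.append((a, b, linked))
    return hits


# Toy masses, for illustration only.
peptides = {"P1": 1000.50, "P2": 1500.75, "P3": 800.25}
observed = 1000.50 + 800.25 - 2 * H_MASS
hits = disulfide_candidates(peptides, observed)
```

The pairwise search grows quadratically with the number of cysteine-containing peptides, which hints at why unaided MS-based identification becomes laborious for proteins with many cysteines.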
Disease risk of missense mutations using structural inference from predicted function
- Curr. Protein Pept. Sci., 2010
Cited by 1 (1 self)
Abstract: Advancements in sequencing techniques place personalized genomic medicine on the horizon, bringing along the responsibility of clinicians to understand the likelihood for a mutation to cause disease, and of scientists to separate etiology from nonpathologic variability. Pathogenicity is discernible from patterns of interactions between a missense mutation, the surrounding protein structure, and intermolecular interactions. Physicochemical stability calculations are not accessible without structures, as is the case for the vast majority of human proteins, so diagnostic accuracy remains in its infancy. To model the effects of missense mutations on functional stability without structure, we combine novel protein sequence analysis algorithms to discern spatial distributions of sequence, evolutionary, and physicochemical conservation, through a new approach to optimize component selection. Novel components include a combinatory substitution matrix and two heuristic algorithms that detect positions which confer structural support to interaction interfaces. The method reaches 0.91 AUC in ten-fold cross-validation to predict alteration of function for 6,392 in vitro mutations. For clinical utility we trained the method on 7,022 disease-associated missense mutations within Online Mendelian Inheritance in Man amongst a larger randomized set. In a blinded prospective test to delineate mutations unique to 186 patients with craniosynostosis from those in the 95 highly variant Coriell controls and 2000 control chromosomes, we achieved roughly 1/3 sensitivity and perfect specificity. The component algorithms retained during machine learning constitute novel protein sequence analysis techniques to describe environments supporting neutrality or pathology of mutations. This approach to pathogenetics enables new insight into the mechanistic relationship of missense mutations to disease phenotypes in our patients.
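The evaluation measures quoted in this abstract can be made concrete with a few lines of code. The sketch below computes sensitivity, specificity, and AUC (as the fraction of positive/negative pairs ranked correctly) on toy labels. The function names and data are hypothetical and only mirror the shape of the reported result; they are not the study's data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: one of three true positives is flagged and no negative is flagged.
sens, spec = sensitivity_specificity([1, 1, 1, 0, 0, 0], [1, 0, 0, 0, 0, 0])
area = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

A classifier with perfect specificity but low sensitivity, as in the toy example, raises no false alarms yet misses most true cases.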
PERMISSION TO USE
In presenting this thesis in partial fulfillment of the requirements for a Postgraduate degree from the University of Saskatchewan, I agree that the Libraries of this University may make it freely available for inspection. I further agree that permission for copying of this thesis in any manner, in whole or in part, for scholarly purposes may be granted by the professor or professors who supervised my thesis work or, in their absence, by the Head of the Department or the Dean of the College in which my thesis work was done. It is understood that any copying or publication or use of this thesis or parts thereof for financial gain shall not be allowed without my written permission. It is also understood that due recognition shall be given to me and to the University of Saskatchewan in any scholarly use which may be made of any material in my thesis.