BaCelLo: a balanced subcellular localization predictor
- Bioinformatics
, 2006
"... doi:10.1093/bioinformatics/btl222 ..."
Identifying Cysteines and Histidines in Transition-Metal-Binding Sites Using Support Vector Machines and Neural Networks
Cited by 17 (10 self)
Accurate predictions of metal-binding sites in proteins using sequence as the only source of information can significantly help in the prediction of protein structure and function, in genome annotation, and in the experimental determination of protein structure. Here, we introduce a method for identifying histidines and cysteines that participate in binding of several transition metals and iron complexes. The method predicts histidines as being in one of two states (free or metal bound) and cysteines in one of three states (free, metal bound, or in disulfide bridges). The method uses only sequence information, utilizing position-specific evolutionary profiles as well as more global descriptors such as protein length and amino acid composition. Our solution is based on a two-stage machine-learning approach. The first stage consists of a support vector machine trained to locally classify the binding state of single histidines and cysteines. The second stage consists of a bidirectional recurrent neural network trained to refine local predictions by taking into account dependencies among residues within the same protein. A simple finite state automaton is employed as a postprocessing step in the second stage in order to enforce an even number of disulfide-bonded cysteines. We predict histidines and cysteines in transition-metal-binding sites at 73% precision and 61% recall. We observe significant differences in performance depending on the ligand (histidine or cysteine) and on the metal bound. We also predict cysteines participating in disulfide bridges at 86% precision and 87% recall. Results are compared to those that would be obtained by using expert information as represented by PROSITE motifs and, for disulfide bonds, to state-of-the-art methods. Proteins 2006;
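The parity constraint mentioned in the abstract (an even number of disulfide-bonded cysteines) can be illustrated with a small two-state automaton that tracks whether the count of disulfide labels so far is even or odd. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the function name `enforce_even_disulfides`, the per-cysteine probability dictionaries, and the Viterbi-style search over the parity automaton are all hypothetical choices made for illustration.

```python
import math

# The three cysteine states named in the abstract.
STATES = ("free", "metal", "disulfide")

def enforce_even_disulfides(probs):
    """Return the most probable labeling with an even number of 'disulfide' labels.

    probs: one dict per cysteine, mapping each state to a probability
    (hypothetical input format, assumed for this sketch).
    """
    # best[parity] = (log-score, labels) of the best path ending in that parity.
    # Parity 0 means an even number of disulfide labels so far.
    best = {0: (0.0, []), 1: (-math.inf, [])}
    for p in probs:
        nxt = {0: (-math.inf, []), 1: (-math.inf, [])}
        for parity, (score, labels) in best.items():
            if score == -math.inf:
                continue  # this parity is not reachable yet
            for state in STATES:
                prob = p.get(state, 0.0)
                if prob <= 0.0:
                    continue
                # A 'disulfide' label flips the automaton between even and odd.
                new_parity = parity ^ int(state == "disulfide")
                candidate = score + math.log(prob)
                if candidate > nxt[new_parity][0]:
                    nxt[new_parity] = (candidate, labels + [state])
        best = nxt
    # Only paths that end in the 'even' state are accepted.
    return best[0][1]

example = [
    {"free": 0.1, "metal": 0.1, "disulfide": 0.8},
    {"free": 0.2, "metal": 0.1, "disulfide": 0.7},
    {"free": 0.4, "metal": 0.05, "disulfide": 0.55},
]
labels = enforce_even_disulfides(example)
```

In this toy input, labeling each cysteine independently would mark all three as disulfide-bonded, which is an odd count; the parity search instead relabels the least confident cysteine.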
PredAlgo, a new subcellular localization prediction tool dedicated to green algae
- Mol Biol Evol
, 2012
"... Research resource ..."
PSI: A Comprehensive and Integrative Approach for Accurate Plant Subcellular Localization Prediction
, 2013
Cited by 1 (0 self)
Predicting the subcellular localization of proteins overcomes the major drawbacks of high-throughput localization experiments, which are costly and time-consuming. However, current subcellular localization predictors are limited in scope and accuracy. In particular, most predictors perform well on certain locations or with certain data sets but poorly on others. Here, we present PSI, a novel high-accuracy web server for plant subcellular localization prediction. PSI derives the wisdom of multiple specialized predictors via a joint approach combining a group decision-making strategy and machine learning methods to give an integrated best result. The overall accuracy obtained (up to 93.4%) was higher than that of the best individual predictor (CELLO) by ~10.7%. The precision for each predictable subcellular location (more than 80%) far exceeds that of the individual predictors. It can also deal with multi-localization proteins. PSI is expected to be a powerful tool in protein location engineering as well as in plant sciences, while the strategy employed could be applied to other integrative problems. A user-friendly web server, PSI, has been developed for free access at
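The idea of combining several specialized predictors into one call can be illustrated with a weighted vote. This is a deliberately simple sketch and should not be read as PSI's actual group decision-making strategy, which the abstract does not specify; the function `weighted_vote`, the predictor names, and the weights are hypothetical.

```python
from collections import defaultdict

def weighted_vote(predictions, weights=None):
    """Combine per-predictor location calls into a single consensus call.

    predictions: dict mapping a predictor name to its predicted location.
    weights: optional dict mapping a predictor name to a vote weight
             (defaults to 1.0 for every predictor).
    """
    weights = weights or {}
    totals = defaultdict(float)
    for name, location in predictions.items():
        totals[location] += weights.get(name, 1.0)
    # Highest total wins; ties are broken alphabetically for determinism.
    return min(totals, key=lambda loc: (-totals[loc], loc))

# Hypothetical predictor names, used only for illustration.
calls = {
    "predictor_a": "chloroplast",
    "predictor_b": "mitochondrion",
    "predictor_c": "chloroplast",
}
consensus = weighted_vote(calls)
reweighted = weighted_vote(calls, weights={"predictor_b": 3.0})
```

With equal weights the majority location wins; giving one predictor a larger weight (for example, because it is known to be reliable for a given compartment) can overturn the majority.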
MPIC: A Mitochondrial Protein Import Components Database for Plant and Non-Plant Species
, 2014
Cited by 1 (0 self)
In the 2 billion years since the endosymbiotic event that gave rise to mitochondria, variations in mitochondrial protein import have evolved across different species. With the genomes of an increasing number of plant species sequenced, it is possible to gain novel insights into mitochondrial protein import pathways. We have generated the Mitochondrial Protein Import Components (MPIC) Database (DB;
Endosymbiotic Gene Transfer and Transcriptional Regulation of Transferred Genes in Paulinella chromatophora
Paulinella chromatophora is a cercozoan amoeba that contains “chromatophores,” which are photosynthetic inclusions of cyanobacterial origin. The recent discovery that chromatophores evolved independently of plastids, underwent major genome reduction, and transferred at least two genes to the host nucleus has highlighted P. chromatophora as a model to infer early steps in the evolution of photosynthetic organelles. However, owing to the paucity of nuclear genome sequence data, the extent of endosymbiotic gene transfer (EGT) and host symbiont regulation are currently unknown. A combination of 454 and Illumina next-generation sequencing enabled us to generate a comprehensive reference transcriptome data set for P. chromatophora on which we mapped short Illumina cDNA reads generated from cultures from the dark and light phases of a diel cycle. Combined with extensive phylogenetic analyses of the deduced protein sequences, these data revealed that 1) about 0.3–0.8% of the nuclear genes were obtained by EGT compared with 11–14% in the Plantae, 2) transferred genes show a distinct bias in that many encode small proteins involved in photosynthesis and photoacclimation, 3) host cells established control over expression of transferred genes, and 4) not only EGT, but to a minor extent also horizontal gene transfer from organisms that presumably served as food sources, helped to shape the nuclear genome of P. chromatophora. The identification of a significant number of transferred genes involved in photosynthesis and photoacclimation of thylakoid membranes as well as the observed transcriptional regulation of these
Research resource PredAlgo: A New Subcellular Localization Prediction Tool Dedicated to Green Algae
The unicellular green alga Chlamydomonas reinhardtii is a prime model for deciphering processes occurring in the intracellular compartments of the photosynthetic cell. Organelle-specific proteomic studies have started to delineate its various subproteomes, but sequence-based prediction software is necessary to assign subcellular localizations to proteins at whole-genome scale. Unfortunately, existing tools are oriented toward land plants and tend to mispredict the localization of nuclear-encoded algal proteins, predicting many chloroplast proteins as mitochondrion targeted. We thus developed a new tool called PredAlgo that predicts intracellular localization of those proteins to one of three intracellular compartments in green algae: the mitochondrion, the chloroplast, and the secretory pathway. At its core, a neural network, trained using carefully curated sets of C. reinhardtii proteins, divides the N-terminal sequence into overlapping 19-residue windows and scores the probability that they belong to a cleavable targeting sequence for one of the aforementioned organelles. A targeting prediction is then deduced for the protein, and a likely cleavage site is predicted based on the shape of the scoring function along the N-terminal sequence. When assessed on an independent benchmarking set of C. reinhardtii sequences, PredAlgo showed a highly improved discrimination capacity between chloroplast- and mitochondrion-localized proteins. Its predictions matched the results of chloroplast proteomics studies well. When tested on other green algae, it gave good results with Chlorophyceae and Trebouxiophyceae but tended to underpredict mitochondrial proteins in Prasinophyceae. Approximately 18% of the nuclear-encoded C. reinhardtii proteome was predicted to be targeted to the chloroplast and 15% to the mitochondrion.
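The windowing step described in the abstract (overlapping 19-residue windows along the N-terminal sequence, each scored separately) can be sketched as follows. Only the window size of 19 comes from the abstract; the step of one residue, the 100-residue N-terminal cutoff, the function names, and the toy scorer are assumptions for illustration, and the scorer is a placeholder, not PredAlgo's neural network.

```python
def sliding_windows(sequence, size=19, step=1):
    """Return all overlapping windows of `size` residues, advancing by `step`."""
    return [sequence[i:i + size]
            for i in range(0, len(sequence) - size + 1, step)]

def score_profile(sequence, scorer, n_terminal=100, size=19):
    """Score each window of the N-terminal region with the supplied scorer.

    n_terminal is an assumed cutoff for how much of the N-terminus to scan.
    """
    region = sequence[:n_terminal]
    return [scorer(window) for window in sliding_windows(region, size)]

def toy_scorer(window):
    """Placeholder scorer (NOT a trained model): fraction of R, S, and A residues."""
    return sum(window.count(aa) for aa in "RSA") / len(window)

# A made-up 25-residue sequence yields 25 - 19 + 1 = 7 windows.
windows = sliding_windows("A" * 25)
profile = score_profile("A" * 25, toy_scorer)
```

The resulting list of per-window scores plays the role of the "scoring function along the N-terminal sequence" from which a targeting call and a cleavage site would then be derived; that downstream logic is not specified in the abstract and is not sketched here.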
unknown title
, 2009
Going from where to why—interpretable prediction of protein subcellular localization
family: acyl-CoA dehydrogenases
, 2009
"... and dispersal of a ubiquitous protein ..."
PREDICTION OF PROTEIN SUBCELLULAR LOCALIZATION: A MACHINE LEARNING APPROACH
Over the years, large-scale genomic and proteomic efforts have produced large amounts of sequence data. One of the key challenges in the post-genomic era is to predict the functions and roles of gene products. Proteins are essential to the structure and function of all living cells, and many of them are enzymes or subunits of enzymes that catalyze chemical reactions.