A combined transmembrane topology and signal peptide prediction method
- J. Mol. Biol, 2004
Cited by 233 (10 self)
Hidden Markov models (HMMs) have been successfully applied to the tasks of transmembrane protein topology prediction and signal peptide prediction. In this paper we expand upon this work by making use of the more powerful class of dynamic Bayesian networks (DBNs). Our model, Philius, is inspired by a previously published HMM, Phobius, and combines a signal peptide submodel with a transmembrane submodel. We introduce a two-stage DBN decoder that combines the power of posterior decoding with the grammar constraints of Viterbi-style decoding. Philius also provides protein type, segment, and topology confidence metrics to aid in the interpretation of the predictions. We report a relative improvement of 13% over Phobius in full-topology prediction accuracy on transmembrane proteins, and a sensitivity and specificity of 0.96 in detecting signal peptides. We also show that our confidence metrics correlate well with the observed precision. In addition, we have made predictions on all 6.3 million proteins in the Yeast Resource Center (YRC) database. This large-scale study provides an overall picture of the relative numbers of proteins that include a signal peptide and/or one or more transmembrane segments as well as a valuable resource for the scientific community. All DBNs are implemented using the Graphical Models Toolkit. Source code for the models described here is available at
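The abstract above describes a decoder that pairs posterior decoding with Viterbi-style grammar constraints. The sketch below is one possible reading of that idea on a toy three-state HMM, not the Philius DBN and not the authors' code. The state names, the two-letter alphabet (H for hydrophobic, P for polar), all probabilities, and the function names `posteriors` and `constrained_decode` are assumptions made up for illustration.

```python
import math

# Toy illustration only: a 3-state HMM with invented parameters, not Philius.
STATES = ["i", "M", "o"]  # inside, membrane, outside (hypothetical labels)
START = {"i": 0.5, "M": 0.0, "o": 0.5}
TRANS = {  # a zero means the grammar forbids that transition (no direct i<->o)
    "i": {"i": 0.9, "M": 0.1, "o": 0.0},
    "M": {"i": 0.1, "M": 0.8, "o": 0.1},
    "o": {"i": 0.0, "M": 0.1, "o": 0.9},
}
EMIT = {  # H = hydrophobic residue, P = polar residue
    "i": {"H": 0.2, "P": 0.8},
    "M": {"H": 0.9, "P": 0.1},
    "o": {"H": 0.3, "P": 0.7},
}


def posteriors(obs):
    """Stage 1: scaled forward-backward; returns P(state at t | obs) per position."""
    n = len(obs)
    fwd, scale = [], []
    for t, x in enumerate(obs):
        row = {}
        for s in STATES:
            prior = START[s] if t == 0 else sum(
                fwd[t - 1][r] * TRANS[r][s] for r in STATES)
            row[s] = prior * EMIT[s][x]
        z = sum(row.values())
        fwd.append({s: v / z for s, v in row.items()})
        scale.append(z)
    bwd = [None] * n
    bwd[n - 1] = {s: 1.0 for s in STATES}
    for t in range(n - 2, -1, -1):
        bwd[t] = {
            s: sum(TRANS[s][r] * EMIT[r][obs[t + 1]] * bwd[t + 1][r]
                   for r in STATES) / scale[t + 1]
            for s in STATES
        }
    post = []
    for t in range(n):
        row = {s: fwd[t][s] * bwd[t][s] for s in STATES}
        z = sum(row.values())
        post.append({s: v / z for s, v in row.items()})
    return post


def constrained_decode(post):
    """Stage 2: Viterbi-style DP that scores a path by its summed log
    posteriors but only follows start states and transitions the grammar allows."""
    neg_inf = float("-inf")

    def lg(p):
        return math.log(p) if p > 0 else neg_inf

    score = [{s: lg(post[0][s]) if START[s] > 0 else neg_inf for s in STATES}]
    back = []
    for t in range(1, len(post)):
        row, ptr = {}, {}
        for s in STATES:
            allowed = [r for r in STATES if TRANS[r][s] > 0]
            best = max(allowed, key=lambda r: score[t - 1][r])
            row[s] = score[t - 1][best] + lg(post[t][s])
            ptr[s] = best
        score.append(row)
        back.append(ptr)
    path = [max(STATES, key=lambda s: score[-1][s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return "".join(reversed(path))


if __name__ == "__main__":
    obs = "PPPHHHHHPPP"
    print(constrained_decode(posteriors(obs)))
```

Taking the most probable state at each position independently (plain posterior decoding) can yield a label string that jumps straight from inside to outside; routing the same posteriors through a constrained dynamic program rules that out by construction, which is the trade-off the abstract alludes to.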
Mammalian transcription factor ATF6 is synthesized as a transmembrane protein and activated by proteolysis in response to endoplasmic reticulum stress
- Mol. Biol. Cell, 1999
Cited by 206 (9 self)
The unfolded protein response (UPR) controls the levels of molecular chaperones and enzymes involved in protein folding in the endoplasmic reticulum (ER). We recently isolated ATF6 as a candidate for mammalian UPR-specific transcription factor. We report here that ATF6 constitutively expressed as a 90-kDa protein (p90ATF6) is directly converted to a 50-kDa protein (p50ATF6) in ER-stressed cells. Furthermore, we showed that the most important consequence of this conversion was altered subcellular localization; p90ATF6 is embedded in the ER, whereas p50ATF6 is a nuclear protein. p90ATF6 is a type II transmembrane glycoprotein with a hydrophobic stretch in the middle of the molecule. Thus, the N-terminal half containing a basic leucine zipper motif is oriented facing the cytoplasm. Full-length ATF6 as well as its C-terminal deletion mutant carrying the transmembrane domain is localized in the ER when transfected. In contrast, mutant ATF6 representing the cytoplasmic region translocates into the nucleus and activates transcription of the endogenous GRP78/BiP gene. We propose that ER stress-induced proteolysis of membrane-bound p90ATF6 releases soluble p50ATF6, leading to induced transcription in the nucleus. Unlike yeast UPR, mammalian UPR appears to use a system similar to that reported for cholesterol homeostasis.
Prediction of the coding sequences of unidentified human genes. VI. The coding sequences of 80 new genes (KIAA0201-KIAA0280) deduced by analysis of cDNA clones from cell line KG-1 and brain
- DNA Res, 1996
Cited by 194 (15 self)
In this series of projects of sequencing human cDNA clones which correspond to relatively long transcripts, we newly determined the entire sequences of 100 cDNA clones which were screened on the basis of the potentiality of coding for large proteins in vitro. The cDNA libraries used were the fractions with average insert sizes from 5.3 to 7.0 kb of the size-fractionated cDNA libraries from human brain. The randomly sampled clones were single-pass sequenced from both ends to select clones that are not registered in the public database. Then their protein-coding potentialities were examined by an in vitro transcription/translation system, and the clones that generated proteins larger than 60 kDa were entirely sequenced. Each clone gave a distinct open reading frame (ORF), and the length of the ORF was roughly coincident with the approximate molecular mass of the in vitro product estimated from its mobility on SDS-polyacrylamide gel electrophoresis. The average size of the cDNA clones sequenced was 6.1 kb, and that of the ORFs corresponded to 1200 amino acid residues. By computer-assisted analysis of the sequences with DNA and protein-motif databases (GenBank and PROSITE databases), the functions of at least 73% of the gene products could be anticipated, and 88% of them (the products of 64 clones) were assigned to the functional categories of proteins relating to cell signaling/communication, nucleic acid managing,
Intrinsically disordered protein
- J. Mol. Graph. Model, 2001
Cited by 160 (16 self)
Dunker K., et al. Proteins can exist in a trinity of structures: the ordered state, the molten globule and the random coil. Five examples follow which suggest that native protein structure can correspond to any of the three states (not just the ordered state) and that protein function can arise from any of the three states and their transitions. 1. In a process that likely mimics infection, fd phage converts from the ordered into the disordered molten globular state. 2. Nucleosome hyperacetylation is crucial to DNA replication and transcription; this chemical modification greatly increases the net negative charge of the nucleosome core particle. We propose that the increased charge imbalance promotes its conversion to a much less rigid form. 3. Clusterin contains an ordered domain and also a native molten globular region. The molten globular domain likely functions as a proteinaceous detergent for cell remodeling and removal of apoptotic debris. 4. In a critical signaling event, a helix in calcineurin becomes bound and surrounded by calmodulin, thereby
Miyano S: Extensive feature detection of N-terminal protein sorting signals
- Bioinformatics
Cited by 136 (6 self)
Motivation: The prediction of localization sites of various proteins is an important and challenging problem in the field of molecular biology. TargetP, by Emanuelsson et al. (2000), is a neural network based system which is currently the best predictor in the literature for N-terminal sorting signals. One drawback of neural networks, however, is that it is generally difficult to understand and interpret how and why they make such predictions. In this paper, we aim to generate simple and interpretable rules as predictors, and still achieve a practical prediction accuracy. We adopt an approach which consists of an extensive search for simple rules and various attributes which is partially guided by human intuition. Results: We have succeeded in finding rules whose prediction accuracies come close to those of TargetP, while still retaining a very simple and interpretable form. We also discuss and interpret the discovered rules. Availability: An (experimental) web service using rules obtained by our method is provided at
Structure and organization of the hepatitis C virus genome isolated from human carriers
- J
Flexible sequence similarity searching with the FASTA3 program package
- Methods Mol. Biol, 2000
Cited by 124 (3 self)
Since the publication of the first rapid method for comparing biological sequences 15 years ago (1), DNA and protein sequence comparison have become routine steps in biochemical characterization, from newly cloned proteins to entire genomes. As the DNA and protein sequence databases become more complete, a sequence similarity search is more likely to reveal
et al. Gene expression patterns in human liver cancers
- Mol Biol Cell
Cited by 117 (4 self)
Hepatocellular carcinoma (HCC) is a leading cause of death worldwide. Using cDNA microarrays to characterize patterns of gene expression in HCC, we found consistent differences between the expression patterns in HCC compared with those seen in nontumor liver tissues. The expression patterns in HCC were also readily distinguished from those associated with tumors metastatic to liver. The global gene expression patterns intrinsic to each tumor were sufficiently distinctive that multiple tumor nodules from the same patient could usually be recognized and distinguished from all the others in the large sample set on the basis of their gene expression patterns alone. The distinctive gene expression patterns are characteristic of the tumors and not the patient; the expression programs seen in clonally independent tumor nodules in the same patient were no more similar than those in tumors from different patients. Moreover, clonally related tumor masses that showed distinct expression profiles were also distinguished by genotypic differences. Some features of the gene expression patterns were associated with specific phenotypic and genotypic characteristics of the tumors, including growth rate, vascular invasion, and p53 overexpression.
A functional-phylogenetic classification system for transmembrane solute transporters
- 2000
Cited by 113 (11 self)