Approximate Protein Structural Alignment in Polynomial Time
- Proc. Natl Acad. Sci. USA, 2004
Cited by 61 (1 self)
Alignment of protein structures is a fundamental task in computational molecular biology. Good structural alignments can help detect distant evolutionary relationships that are hard or impossible to discern from protein sequences alone. Here, we study the structural alignment problem as a family of optimization problems and develop an approximate polynomial time algorithm to solve them. For a commonly used scoring function, the algorithm runs in O(n ) time, for globular proteins of length n, when we wish to detect all scores that are at most ε away from the optimum. We argue that such approximate solutions are, in fact, of greater interest than exact ones, due to the noisy nature of experimentally determined protein coordinates. The measure of similarity between a pair of protein structures used by the algorithm involves the Euclidean distance between the structures after rigidly transforming them. We show that an alternative approach, which relies on internal distance matrices, must incorporate sophisticated geometric ingredients in order to both guarantee optimality and run in polynomial time. We use these observations to visualize the scoring function for several real instances of the problem. Our investigations yield new insights on the computational complexity of protein alignment under various scoring functions. These insights can be used in the design of new scoring functions for which the optimum can be approximated efficiently, and perhaps in the development of efficient algorithms for the multiple structural alignment problem.
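The abstract contrasts scoring after a rigid transformation with scoring on internal distance matrices. As a minimal illustration of the second idea only, and not the authors' algorithm, the Python sketch below compares two equal-length coordinate lists through their pairwise distances, which are unchanged by rotation and translation. The function names and the toy coordinates are assumptions made for this example.

```python
import math


def distance_matrix(coords):
    """Return all pairwise Euclidean distances between 3D points."""
    return [[math.dist(p, q) for q in coords] for p in coords]


def distance_matrix_rmsd(a, b):
    """Root-mean-square difference between the internal distance
    matrices of two equal-length point lists (pairs i < j only)."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two point lists of equal length >= 2")
    da, db = distance_matrix(a), distance_matrix(b)
    n = len(a)
    squared = [(da[i][j] - db[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n)]
    return math.sqrt(sum(squared) / len(squared))


def rotate_z_and_shift(coords, angle, shift):
    """Apply a rigid motion: rotation about the z axis, then a translation."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
            for x, y, z in coords]


# Toy four-point "chain" (made-up coordinates, not a real protein).
chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 3.8)]
moved = rotate_z_and_shift(chain, math.pi / 3, (10.0, -5.0, 2.0))
stretched = [(2 * x, y, z) for x, y, z in chain]

print(distance_matrix_rmsd(chain, moved))      # zero up to rounding error
print(distance_matrix_rmsd(chain, stretched))  # clearly positive
```

Internal distances are also unchanged by mirror reflection, so a distance-matrix score cannot tell a structure from its mirror image, whereas a rigid superposition can.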
Fold change in evolution of protein structures
- J. Struct. Biol, 2001
Cited by 58 (5 self)
Typically, protein spatial structures are more conserved in evolution than amino acid sequences. However, the recent explosion of sequence and structure information accompanied by the development of powerful computational methods led to the accumulation of examples of homologous proteins with globally distinct structures. Significant sequence conservation, local structural resemblance, and functional similarity strongly indicate evolutionary relationships between these proteins despite pronounced structural differences at the fold level. Several mechanisms such as insertions/deletions/substitutions, circular permutations, and rearrangements in β-sheet topologies account for the majority of detected structural irregularities. The existence of evolutionarily related proteins that possess different folds brings new challenges to the homology modeling techniques and the structure classification strategies and offers new opportunities for protein design in experimental studies. © 2001 Academic Press.
Key Words: circular permutation; insertion; deletion; molecular evolution; protein structure classification; conformational change; homology modeling.
Patterns of protein-fold usage in eight microbial genomes: a comprehensive structural census
- Proteins, 1998
Intrinsic Protein Disorder in Complete Genomes
- 2000
Cited by 53 (8 self)
Intrinsic protein disorder refers to segments or to whole proteins that fail to fold completely on their own. Here we predicted disorder on protein sequences from 34 genomes, including 22 bacteria, 7 archaea, and 5 eukaryotes. Predicted disordered segments ≥ 50, ≥ 40, and ≥ 30 in length were determined, as well as proteins estimated to be wholly disordered. The five eukaryotes were separated from bacteria and archaea by having the highest percentages of sequences predicted to have disordered segments ≥ 50 in length: from 25% for Plasmodium to 41% for Drosophila. Estimates of wholly disordered proteins in the bacteria range from 1% to 8%, averaging to 3±2%; estimates in various archaea range from 2 to 11%, plus an apparently anomalous 18%, averaging to 7±5%, which drops to 5±3% if the high value is discarded. Estimates in the 5 eukaryotes range from 3 to 17%. The putative wholly disordered proteins were often ribosomal proteins, but in addition about equal numbers were of known and unknown function. Overall, intrinsic disorder appears to be common, with eukaryotes perhaps having a higher percentage of native disorder than archaea or bacteria.
Keywords: intrinsic disorder, prediction, structural genomics
Network analysis of protein structures identifies functional residues
- J. Mol. Biol, 2004
Cited by 44 (0 self)
Identifying active site residues strictly from protein three-dimensional structure is a difficult task, especially for proteins that have few or no homologues. We transformed protein structures into residue interaction graphs (RIGs), where amino acid residues are graph nodes and their interactions with each other are the graph edges. We found that active site, ligand-binding and evolutionarily conserved residues typically have high closeness values. Residues with high closeness values interact directly or by a few intermediates with all other residues of the protein. Combining closeness and surface accessibility identified active site residues in 70% of 178 representative structures. Detailed structural analysis of specific enzymes also located other types of functional residues. These include the substrate binding sites of acetylcholinesterases and subtilisin, and the regions whose structural changes activate MAP kinase and glycogen phosphorylase. Our approach uses single protein structures, and does not rely on sequence conservation, comparison to other similar structures or any prior knowledge. Residue closeness is distinct from various sequence and structure measures and can thus complement them in identifying key protein residues. Closeness integrates the effect of the entire protein on single residues. Such natural structural design may be evolutionarily maintained to preserve interaction redundancy and contribute to optimal setting of functional sites.
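The closeness idea in this abstract can be sketched on a toy residue interaction graph. The Python example below uses the standard graph-theoretic definition of closeness for a connected, unweighted graph, (n - 1) divided by the sum of shortest-path lengths; whether the paper normalizes in exactly this way is an assumption here. The residue labels and contacts are hypothetical, and the sketch omits the surface-accessibility filter the authors combine with closeness.

```python
from collections import deque


def closeness(graph):
    """Closeness centrality of every node in a connected, unweighted graph.

    graph maps each node to the set of its neighbours. Closeness is
    (n - 1) / (sum of shortest-path lengths to all other nodes).
    """
    n = len(graph)
    if n < 2:
        raise ValueError("need at least two nodes")
    result = {}
    for source in graph:
        # Breadth-first search gives shortest path lengths from source.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        if len(dist) != n:
            raise ValueError("graph must be connected")
        result[source] = (n - 1) / sum(dist.values())
    return result


# Hypothetical six-residue interaction graph (labels and contacts invented).
rig = {
    "A1": {"G2", "H5"},
    "G2": {"A1", "S3", "H5"},
    "S3": {"G2", "D4", "H5"},
    "D4": {"S3", "H5"},
    "H5": {"A1", "G2", "S3", "D4", "L6"},
    "L6": {"H5"},
}

scores = closeness(rig)
best = max(scores, key=scores.get)
print(best, scores[best])  # the residue that contacts all others ranks first
```

In this toy graph the residue in direct contact with every other residue receives the maximum closeness of 1.0, which mirrors the abstract's description of high-closeness residues as interacting directly or through a few intermediates with the rest of the protein.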
Proteins with Similar Architecture Exhibit Similar Large-Scale
- Biophys. J, 2000
Cited by 44 (9 self)
We have investigated the similarities and differences in the computed dynamic fluctuations exhibited by six members of a protein fold family with a coarse-grained Gaussian network model. Specifically, we consider the cofactor binding fragment of CysB; the lysine/arginine/ornithine-binding protein (LAO); the enzyme porphobilinogen deaminase (PBGD); the ribose-binding protein (RBP); the N-terminal lobe of ovotransferrin in apo-form (apo-OVOT); and the leucine/isoleucine/valine-binding protein (LIVBP). All have domains that resemble a Rossmann fold, but there are also some significant differences. Results indicate that similar global dynamic behavior is preserved for the members of a fold family, and that differences usually occur only in regions where specific function is localized. The present work is a computational demonstration that the scaffold of a protein fold may be utilized for diverse purposes. LAO requires a bound ligand before it conforms to the large-scale fluctuation behavior of the three other members of the family, CysB, PBGD, and RBP, all of which contain a substrate (cofactor) at the active site cleft. The dynamics of the ligand-free enzymes LIVBP and apo-OVOT, on the other hand, concur with those of unliganded LAO. The present results suggest that it is possible to construct structure alignments based on dynamic fluctuation behavior.
Target space for structural genomics revisited
- 2002
Cited by 43 (14 self)
Motivation: Structural genomics eventually aims at determining structures for all proteins. However, in the beginning experimentalists are likely to focus on globular proteins to achieve a rapid basic coverage of protein sequence space. How many proteins will structural genomics have to target? How many proteins will be excluded since we already have structural information for these or since they are not globular? We have to answer these questions in the context of our target selection for the North-East Structural Genomics Consortium (NESG). Results: We estimated that structural information is available for about 6–38% of all proteins; 6% if we require high accuracy in comparative modelling, 38% if we are satisfied with having a rough idea about the fold. Excluding all regions that are not globular, we found that structural genomics may have to target about 48% of all proteins. This corresponded to a similar percentage of residues of the entire proteomes (52%). We explored a number of different strategies to cluster protein space in order to find the number of families representing these 48% of structurally unknown proteins. For the subset of all entirely sequenced eukaryotes, we found over 18 000 fragment clusters each of which may be a suitable target for structural genomics. Availability: All data are available from the authors, most results are summarized at:
SPINE: an integrated tracking database and data mining approach for identifying feasible targets in high-throughput structural proteomics.
- Nucleic Acids Res., 2001
CONTRAlign: discriminative training for protein sequence alignment
- In: International Conference in Research on Computational Molecular Biology (RECOMB), 2006
Cited by 36 (4 self)
In comparative structural biology studies, analyzing or predicting protein three-dimensional structure often begins with identifying patterns of amino acid substitution via protein sequence alignment. While the evolutionary information obtained from alignments can provide insights into protein structure, constructing accurate alignments may be difficult when proteins share significant structural similarity but little sequence similarity. Indeed, for modern alignment tools, alignment quality drops rapidly when the sequences compared have lower than 25% identity, the "twilight zone" of protein alignment [1].
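The 25% identity threshold mentioned here depends on how percent identity is computed, and several conventions exist. The Python sketch below uses one common choice, identical residues divided by the number of columns in which neither sequence has a gap; this convention, the function name and the toy aligned sequences are assumptions for illustration and are not taken from the paper.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap.

    Both inputs must be rows of the same pairwise alignment, so they
    have equal length. Other denominators (alignment length, shorter
    sequence length) are also in use and give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != gap and y != gap]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Toy alignment rows (invented sequences).
row_a = "MKT-AYIAKQR"
row_b = "MRTLAY-SKQK"

identity = percent_identity(row_a, row_b)
print(round(identity, 1))
print("twilight zone" if identity < 25.0 else "above 25% identity")
```

Because the denominator changes the result, any threshold such as 25% should be read together with the identity definition used by the tool that reports it.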
Identifying DNA-binding proteins using structural motifs and the electrostatic potential
- Nucleic Acids Res, 2004
Cited by 35 (1 self)
Robust methods to detect DNA-binding proteins from structures of unknown function are important for structural biology. This paper describes a method for identifying such proteins that (i) have a solvent accessible structural motif necessary for DNA-binding and (ii) a positive electrostatic potential in the region of the binding region. We focus on three structural motifs: helix–turn–helix (HTH), helix–hairpin–helix (HhH) and helix–loop–helix (HLH). We find that the combination of these variables detects 78% of proteins with an HTH motif, which is a substantial improvement over previous work based purely on structural templates and is comparable to more complex methods of identifying DNA-binding proteins. Similar true positive fractions are achieved for the HhH and HLH motifs. We see evidence of wide evolutionary diversity for DNA-binding proteins with an HTH motif, and much smaller diversity for those with an HhH or HLH motif.