Protein homology detection by HMM-HMM comparison
- BIOINFORMATICS, 2005
Cited by 401 (8 self)
Motivation: Protein homology detection and sequence alignment are at the basis of protein structure prediction, function prediction, and evolution. Results: We have generalized the alignment of protein sequences with a profile hidden Markov model (HMM) to the case of pairwise alignment of profile HMMs. We present a method for detecting distant homologous relationships between proteins based on this approach. The method (HHsearch) is benchmarked together with BLAST, PSI-BLAST, HMMER, and the profile-profile comparison tools PROF_SIM and COMPASS, in an all-against-all comparison of a database of 3691 protein domains from SCOP 1.63 with pairwise sequence identities below 20%. Sensitivity: When predicted secondary structure is included in the HMMs, HHsearch is able to detect between 2.7 and 4.2 times more homologs than PSI-BLAST or HMMER and between 1.44 and 1.9 times more than COMPASS or PROF_SIM for a rate of false positives of 10%. Approximately half of the improvement over the profile-profile comparison methods is attributable to the use of profile HMMs in place of simple profiles. Alignment quality: Higher sensitivity is mirrored by an increased alignment quality. HHsearch produced 1.2, 1.7, and 3.3 times more good alignments (“balanced” score > 0.3) than the next best method (COMPASS), and 1.6, 2.9, and 9.4 times more than PSI-BLAST, at the family, superfamily, and fold level. Speed: HHsearch scans a query of 200 residues against 3691 domains in 33 s on an AMD64 3 GHz PC. This is 10 times faster than PROF_SIM and 17 times faster than
PROBCONS: Probabilistic consistency-based multiple sequence alignment
- Genome Res, 2005
Cited by 256 (10 self)
To study gene evolution across a wide range of organisms, biologists need accurate tools for multiple sequence alignment of protein families. Obtaining accurate alignments, however, is a difficult computational problem because of not only the high computational cost but also the lack of proper objective functions for measuring alignment quality. In this paper, we introduce probabilistic consistency, a novel scoring function for multiple sequence comparisons. We present PROBCONS, a practical tool for progressive protein multiple sequence alignment based on probabilistic consistency, and evaluate its performance on several standard alignment benchmark datasets. On the BAliBASE, SABmark, and PREFAB benchmark alignment databases, PROBCONS achieves statistically significant improvement over other leading methods while maintaining practical speed. PROBCONS is publicly available as a web resource. Source code and executables are available under the GNU Public License at
Within the Twilight Zone: A Sensitive Profile-Profile Comparison Tool Based on Information Theory
- J. Mol. Biol, 2002
Cited by 147 (4 self)
This paper presents a novel approach to profile-profile comparison. The method compares two input profiles (like those that are generated by PSI-BLAST) and assigns a similarity score to assess their statistical similarity. Our profile-profile comparison tool, which allows for gaps, can be used to detect weak similarities between protein families. It has also been optimized to produce alignments that are in very good agreement with structural alignments. Tests show that the profile-profile alignments are indeed highly correlated with similarities between secondary structure elements and tertiary structure. Exhaustive evaluations show that our method is significantly more sensitive in detecting distant homologies than the popular profile-based search programs PSI-BLAST and IMPALA. The relative improvement is the same order of magnitude as the improvement of PSI-BLAST relative to BLAST. Our new tool often detects similarities that fall within the twilight zone of sequence similarity
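The abstract does not say which information-theoretic measure scores a pair of profile columns. Purely as an illustration, the sketch below uses the Jensen-Shannon divergence, one common symmetric measure of how different two probability distributions are. The function names and the toy four-letter alphabet are assumptions for this example, not taken from the paper.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical columns, 1 for disjoint ones."""
    # The midpoint distribution is positive wherever p or q is, so the
    # KL terms below never divide by zero.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical profile columns over a toy 4-letter alphabet (real profiles use 20 amino acids).
column_a = [0.7, 0.1, 0.1, 0.1]
column_b = [0.1, 0.1, 0.1, 0.7]

print(js_divergence(column_a, column_a))  # identical columns
print(js_divergence(column_a, column_b))  # dissimilar columns
```

A column-column similarity could then be derived from the divergence (for example, one minus its value) and fed into a gapped dynamic-programming alignment; that step is omitted here.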
Pcons: A neural-network-based consensus predictor that improves fold recognition
- Protein Sci, 2001
Alignment of protein sequences by their profiles
- 2004
Cited by 69 (14 self)
The accuracy of an alignment between two protein sequences can be improved by including other detectably related sequences in the comparison. We optimize and benchmark such an approach that relies on aligning two multiple sequence alignments, each one including one of the two protein sequences. Thirteen different protocols for creating and comparing profiles corresponding to the multiple sequence alignments are implemented in the SALIGN command of MODELLER. A test set of 200 pairwise, structure-based alignments with sequence identities below 40% is used to benchmark the 13 protocols as well as a number of previously described sequence alignment methods, including heuristic pairwise sequence alignment by BLAST, pairwise sequence alignment by global dynamic programming with an affine gap penalty function by the ALIGN command of MODELLER, sequence-profile alignment by PSI-BLAST, Hidden Markov Model methods implemented in SAM and LOBSTER, pairwise sequence alignment relying on predicted local structure by SEA, and multiple sequence alignment by CLUSTALW and COMPASS. The alignment accuracies of the best new protocols were significantly better than those of the other tested methods. For example, the fraction of the correctly aligned residues relative to the structure-based alignment by the best protocol is 56%, which can be compared with the accuracies of 26%, 42%, 43%, 48%, 50%, 49%, 43%, and 43% for the other methods, respectively. The new method is currently applied to large-scale comparative protein structure modeling of all known sequences.
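The accuracy figure quoted in this abstract, the fraction of correctly aligned residues relative to the structure-based alignment, can be illustrated with a minimal sketch. The representation of an alignment as two gapped strings, the function names, and the toy sequences are assumptions made for this example; this is not the SALIGN or MODELLER implementation.

```python
def aligned_pairs(row_a, row_b, gap="-"):
    """Return the set of (i, j) residue-index pairs placed in the same column."""
    pairs = set()
    i = j = 0  # running residue indices in the ungapped sequences
    for a, b in zip(row_a, row_b):
        if a != gap and b != gap:
            pairs.add((i, j))
        if a != gap:
            i += 1
        if b != gap:
            j += 1
    return pairs

def fraction_correct(test, reference):
    """Share of reference-aligned residue pairs that the test alignment reproduces."""
    ref_pairs = aligned_pairs(*reference)
    if not ref_pairs:
        return 0.0
    return len(aligned_pairs(*test) & ref_pairs) / len(ref_pairs)

# Hypothetical toy data: the reference stands in for a structure-based alignment.
reference = ("ACDE", "ACDE")
test = ("AC-DE", "ACD-E")
print(fraction_correct(test, reference))  # 3 of the 4 reference pairs are recovered
```

Normalizing by the number of pairs in the reference means that a test alignment is not rewarded for aligning extra residues the structure-based alignment leaves unpaired.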
Cyclic coordinate descent: A robotics algorithm for protein loop closure
- Protein Sci
Cited by 65 (0 self)
Including biological literature improves homology search
- in Proceedings of the Pacific Symposium on Biocomputing, 2001
COACH: profile-profile alignment of protein families using hidden Markov models
- BIOINFORMATICS, 2004
Sequence variations within protein families are linearly related to structural variations
- J. Mol. Biol, 2002
Cited by 18 (0 self)
A protein sequence folds into a unique three-dimensional structure. Interestingly, this one-to-one correspondence is no longer valid when all proteins are considered. The size of the protein
Multiple sequence alignment with evolutionary computation
- Genetic Programming and Evolvable Machines, 2004
Cited by 16 (0 self)
In this paper we provide a brief review of current work in the area of multiple sequence alignment (MSA) for DNA and protein sequences using evolutionary computation (EC). We detail the strengths and weaknesses of EC techniques for MSA. In addition, we present two novel approaches for inferring MSA using genetic algorithms. Our first novel approach utilizes a GA to evolve an optimal guide tree in a progressive alignment algorithm and serves as an alternative to the more traditional heuristic techniques such as neighbor-joining. The second novel approach facilitates the optimization of a consensus sequence with a GA using a vertically scalable encoding scheme in which the number of iterations needed to find the optimal solution is approximately the same regardless of the number of sequences being aligned. We compare both of our novel approaches to the popular progressive alignment program Clustal W. Experiments have confirmed that EC constitutes an attractive and promising alternative to traditional heuristic algorithms for MSA.