Kalign, Kalignvu and Mumsa: web servers for multiple sequence alignment - Nucleic Acids Res, 2006
Cited by 8 (1 self)
Obtaining high quality multiple alignments is crucial for a range of sequence analysis tasks. A common strategy is to align the sequences several times, varying the program or parameters until the best alignment according to manual inspection by human experts is found. Ideally, this should be assisted by an automatic assessment of the alignment quality. Our web-site
JD: Strategies for reliable exploitation of evolutionary concepts in high throughput biology. Evol Bioinform Online 2008
Model-based prediction of sequence alignment quality - Bioinformatics, 2008
Cited by 5 (0 self)
Motivation: Multiple sequence alignment (MSA) is an essential prerequisite for many sequence analysis methods and a valuable tool in itself for describing relationships between protein sequences. Since the success of the sequence analysis is highly dependent on the reliability of alignments, measures for assessing the quality of alignments are much needed. Results: We present a statistical model-based alignment quality score. Unlike other quality scores, it does not require several parallel alignments for the same set of sequences or additional structural information. Our quality score is based on measuring the conservation level of reference alignments in the Homstrad database. Reference sequences were re-aligned with the Mafft, Muscle and Probcons alignment programs, and a sum-of-pairs (SP) score was used to measure the quality of the re-alignments. Statistical modelling of the SP score as a function of conservation level and other alignment characteristics makes it possible to predict the SP score for any global MSA. The predicted SP scores are highly correlated with the correct SP scores when tested on the Homstrad and SABmark databases. The results are comparable to those of MOS and better than those of the NorMD and NiRMSD alignment quality criteria. Furthermore, the predicted SP score is able to detect alignments with badly aligned or unrelated sequences.
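The abstract above does not spell out how the SP score is computed. As a rough illustration only, the sketch below uses one common definition: the fraction of residue pairs aligned in a reference alignment that are also aligned in the test alignment. The function names `aligned_pairs` and `sp_score` and the `-` gap character are assumptions made for this example, not taken from the paper.

```python
from itertools import combinations


def aligned_pairs(alignment):
    """Collect every pair of residues that share a column.

    `alignment` is a list of equal-length gapped strings ('-' marks a gap).
    Each residue is identified as (sequence index, ungapped position).
    """
    positions = [0] * len(alignment)  # next ungapped position per sequence
    pairs = set()
    for col in range(len(alignment[0])):
        residues = []
        for seq_index, row in enumerate(alignment):
            if row[col] != "-":
                residues.append((seq_index, positions[seq_index]))
                positions[seq_index] += 1
        # residues is ordered by sequence index, so pairs are canonical
        pairs.update(combinations(residues, 2))
    return pairs


def sp_score(test, reference):
    """Fraction of reference residue pairs reproduced by the test alignment."""
    reference_pairs = aligned_pairs(reference)
    if not reference_pairs:
        return 0.0
    return len(reference_pairs & aligned_pairs(test)) / len(reference_pairs)


reference = ["AC-G", "ACTG"]
test = ["ACG-", "ACTG"]
print(sp_score(test, reference))
```

In this toy case the reference aligns three residue pairs and the test alignment recovers two of them. The model in the paper predicts such a score without access to a reference; this sketch only shows the quantity being predicted, under the stated definition.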
COMET: adaptive context-based modeling for ultrafast HIV-1 subtype identification, 2014
Cited by 2 (0 self)
Viral sequence classification has wide applications in clinical, epidemiological, structural and functional categorization studies. Most existing approaches rely on an initial alignment step followed by classification based on phylogenetic or statistical algorithms. Here we present an ultrafast alignment-free subtyping tool for human immunodeficiency virus type one (HIV-1) adapted from Prediction by Partial Matching compression. This tool, named COMET, was compared to the widely used phylogeny-based REGA and SCUEAL tools using synthetic and clinical HIV data sets (1 090 698 and 10 625 sequences, respectively). COMET’s sensitivity and specificity were comparable to or higher than the two other subtyping tools on both data sets for known subtypes. COMET also excelled in detecting and identifying new recombinant forms, a frequent feature of the HIV epidemic. Runtime comparisons showed that COMET was almost as fast as USEARCH. This study demonstrates the advantages of alignment-free classification of viral sequences, which feature high rates of variation, recombination and insertions/deletions. COMET is free to use via an online interface.
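To make the idea of alignment-free, context-based classification concrete, here is a much simplified sketch. It is not COMET's algorithm: it swaps in a fixed-order context model with add-one smoothing in place of the variable-order Prediction by Partial Matching scheme, and the names `train`, `log_likelihood`, `classify`, the order of 2, and the toy sequences are all assumptions made for illustration.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


def train(sequences, order=2):
    """Count which nucleotide follows each context of length `order`."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for i in range(order, len(seq)):
            counts[seq[i - order:i]][seq[i]] += 1
    return counts


def log_likelihood(seq, counts, order=2):
    """Log-probability of `seq` under one model, with add-one smoothing."""
    total = 0.0
    for i in range(order, len(seq)):
        context = counts.get(seq[i - order:i], {})
        seen = sum(context.values())
        total += math.log((context.get(seq[i], 0) + 1) / (seen + len(ALPHABET)))
    return total


def classify(seq, models, order=2):
    """Return the label whose model assigns `seq` the highest likelihood."""
    return max(models, key=lambda label: log_likelihood(seq, models[label], order))


# Toy training data, purely illustrative (not real HIV-1 subtypes).
models = {
    "A": train(["ACACACACACAC"]),
    "B": train(["GGTTGGTTGGTT"]),
}
print(classify("ACACAC", models), classify("GGTTGG", models))
```

No alignment is computed at any point: each query is scored directly against one context model per class, which is why approaches of this kind can be fast.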
Learning Parameter Sets for Alignment Advising
Cited by 2 (2 self)
While the multiple sequence alignment output by an aligner strongly depends on the parameter values used for the alignment scoring function (such as the choice of gap penalties and substitution scores), most users rely on the single default parameter setting provided by the aligner. A different parameter setting, however, might yield a much higher-quality alignment for the specific set of input sequences. The problem of picking a good choice of parameter values for specific input sequences is called parameter advising. A parameter advisor has two ingredients: (i) a set of parameter choices to select from, and (ii) an estimator that provides an estimate of the accuracy of the alignment computed by the aligner using a parameter choice. The parameter advisor picks the parameter choice from the set whose resulting alignment has highest estimated accuracy. We consider for the first time the problem of learning the optimal set of parameter choices for a parameter advisor that uses a given accuracy estimator. The optimal set is one that maximizes the expected true accuracy of the resulting parameter advisor, averaged over a collection of training data. While we prove that learning an optimal set for an advisor is NP-complete, we show there is a natural approximation algorithm for this problem, and prove a tight bound on its approximation ratio. Experiments with an implementation of this approximation algorithm on biological benchmarks, using various accuracy estimators from the literature, show it finds sets for advisors that are surprisingly close to optimal. Furthermore, the resulting parameter advisors are significantly more accurate in practice than simply aligning with a single default parameter choice.
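The two ingredients of a parameter advisor described above (a set of parameter choices and an accuracy estimator) can be sketched as follows. This is a minimal illustration, not the authors' implementation: `advise`, `toy_align`, and `toy_estimate` are hypothetical names, and the toy aligner and estimator stand in for a real aligner and a real accuracy estimator.

```python
def advise(sequences, parameter_choices, align, estimate):
    """Align with every parameter choice and keep the best-estimated result.

    Returns (parameter choice, alignment, estimated accuracy).
    """
    best = None
    for params in parameter_choices:
        alignment = align(sequences, params)
        score = estimate(alignment)
        if best is None or score > best[2]:
            best = (params, alignment, score)
    if best is None:
        raise ValueError("parameter_choices must not be empty")
    return best


def toy_align(sequences, params):
    """Hypothetical stand-in for an aligner: pad sequences on one side."""
    width = max(len(s) for s in sequences)
    if params == "left":
        return [s.ljust(width, "-") for s in sequences]
    return [s.rjust(width, "-") for s in sequences]


def toy_estimate(alignment):
    """Hypothetical stand-in for an estimator: fraction of identical, gap-free columns."""
    columns = list(zip(*alignment))
    identical = sum(1 for col in columns if "-" not in col and len(set(col)) == 1)
    return identical / len(columns)


choice, alignment, score = advise(["ACGT", "CGT"], ["left", "right"], toy_align, toy_estimate)
print(choice, alignment, score)
```

The advisor never sees true accuracy, only the estimate, which is why the quality of the estimator and the composition of the parameter set both matter. Learning which choices to put in the set, the problem the abstract addresses, is not attempted here.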
GUIDANCE2: accurate detection of unreliable alignment regions accounting for the uncertainty of multiple parameters, 2015