Results

### Ensemble Multiple Sequence Alignment via Advising

The multiple sequence alignments computed by an aligner for different settings of its parameters, as well as the alignments computed by different aligners using their default settings, can differ markedly in accuracy. Parameter advising is the task of choosing a parameter setting for an aligner to maximize the accuracy of the resulting alignment. We extend parameter advising to aligner advising, which in contrast chooses among a set of aligners to maximize accuracy. In the context of aligner advising, default advising selects from a set of aligners that are using their default settings, while general advising selects both the aligner and its parameter setting. In this paper, we apply aligner advising for the first time, to create a true ensemble aligner. Through cross-validation experiments on benchmark protein sequence alignments, we show that parameter advising boosts an aligner’s accuracy beyond its default setting for virtually all of the standard aligners currently used in practice. Furthermore, aligner advising with a collection of aligners further improves upon parameter advising with any single aligner, though surprisingly the performance of default advising on testing data is actually superior to general advising due to less overfitting to training data. The new ensemble aligner that results from aligner advising is significantly more accurate than the best single default aligner, especially on hard-to-align sequences. This successfully demonstrates how to construct out of a collection of individual aligners, a more accurate ensemble aligner.
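
The distinction between default and general advising can be illustrated with a minimal Python sketch. It assumes that some accuracy estimator is available to rank candidate alignments; the abstract above does not say how candidates are scored. The names `advise`, `toy_run_aligner`, `toy_estimator`, and the aligner labels `A` and `B` are hypothetical stand-ins, not part of the paper or of any real aligner.

```python
from typing import Callable, Iterable, List, Tuple

# A candidate is an (aligner name, parameter-setting label) pair.
# Both fields are hypothetical labels used only for illustration.
Candidate = Tuple[str, str]


def advise(candidates: Iterable[Candidate],
           run_aligner: Callable[[Candidate, List[str]], str],
           estimate_accuracy: Callable[[str], float],
           sequences: List[str]) -> Tuple[Candidate, str]:
    """Return the candidate whose alignment has the highest estimated accuracy."""
    scored = []
    for cand in candidates:
        alignment = run_aligner(cand, sequences)
        scored.append((estimate_accuracy(alignment), cand, alignment))
    if not scored:
        raise ValueError("need at least one candidate")
    _, best_cand, best_alignment = max(scored, key=lambda t: t[0])
    return best_cand, best_alignment


# Toy stand-ins so the sketch runs; real aligners and a real
# accuracy estimator would replace these (assumption).
def toy_run_aligner(cand: Candidate, sequences: List[str]) -> str:
    name, setting = cand
    return f"{name}/{setting}:" + "|".join(sequences)


TOY_SCORES = {"A/default": 0.60, "B/default": 0.72, "B/alt": 0.70}


def toy_estimator(alignment: str) -> float:
    return TOY_SCORES[alignment.split(":")[0]]


# Default advising: every aligner runs with its default setting.
default_set = [("A", "default"), ("B", "default")]
# General advising: non-default settings are candidates as well.
general_set = default_set + [("B", "alt")]

default_choice, _ = advise(default_set, toy_run_aligner, toy_estimator, ["MKV", "MRV"])
general_choice, _ = advise(general_set, toy_run_aligner, toy_estimator, ["MKV", "MRV"])
```

In this sketch the only difference between the two modes is the candidate set passed to `advise`. A larger candidate set gives the advisor more ways to be misled by estimator error, which is one possible reading of the overfitting effect the abstract reports.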

### Learning Parameter-Advising Sets for Multiple Sequence Alignment

IEEE/ACM Transactions on Computational Biology and Bioinformatics

While the multiple sequence alignment output by an aligner strongly depends on the parameter values used for the alignment scoring function (such as the choice of gap penalties and substitution scores), most users rely on the single default parameter setting provided by the aligner. A different parameter setting, however, might yield a much higher-quality alignment for the specific set of input sequences. The problem of picking a good choice of parameter values for specific input sequences is called parameter advising. A parameter advisor has two ingredients: (i) a set of parameter choices to select from, and (ii) an estimator that provides an estimate of the accuracy of the alignment computed by the aligner using a parameter choice. The parameter advisor picks the parameter choice from the set whose resulting alignment has highest estimated accuracy. We consider for the first time the problem of learning the optimal set of parameter choices for a parameter advisor that uses a given accuracy estimator. The optimal set is one that maximizes the expected true accuracy of the resulting parameter advisor, averaged over a collection of training data. While we prove that learning an optimal set for an advisor is NP-complete, we show there is a natural approximation algorithm for this problem, and prove a tight bound on its approximation ratio. Experiments with an implementation of this approximation algorithm on biological benchmarks, using various accuracy estimators from the literature, show it finds sets for advisors that are surprisingly close to optimal. Furthermore, the resulting parameter advisors are significantly more accurate in practice than simply aligning with a single default parameter choice.

Index Terms: Multiple sequence alignment, alignment scoring functions, parameter values, accuracy estimation, parameter advising.
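
The set-learning problem described above can be sketched in a few lines of Python. The sketch evaluates an advisor's average true accuracy on training data, then builds a set of parameter choices with a simple greedy heuristic. The greedy strategy is an assumption made for illustration; the abstract does not name its approximation algorithm, so this should not be read as the authors' method. The function names, the table layout, and the toy accuracy values are all hypothetical.

```python
from typing import Dict, List, Sequence

# benchmark name -> parameter choice -> accuracy (hypothetical layout)
Table = Dict[str, Dict[str, float]]


def advisor_accuracy(param_set: Sequence[str], est: Table, true: Table) -> float:
    """Average true accuracy of an advisor restricted to param_set.

    For each benchmark the advisor picks the parameter choice with the
    highest *estimated* accuracy and is credited with that choice's
    *true* accuracy.
    """
    total = 0.0
    for bench in true:
        picked = max(param_set, key=lambda p: est[bench][p])
        total += true[bench][picked]
    return total / len(true)


def greedy_advising_set(all_params: Sequence[str], est: Table,
                        true: Table, k: int) -> List[str]:
    """Greedy heuristic (an assumption, not the paper's algorithm):
    repeatedly add the parameter choice that most improves the
    advisor's average true accuracy on the training data."""
    chosen: List[str] = []
    while len(chosen) < min(k, len(all_params)):
        remaining = [p for p in all_params if p not in chosen]
        best = max(remaining,
                   key=lambda p: advisor_accuracy(chosen + [p], est, true))
        chosen.append(best)
    return chosen


# Toy training data: made-up accuracies for two benchmarks and three
# parameter choices.
true_acc: Table = {
    "b1": {"p1": 0.95, "p2": 0.5, "p3": 0.6},
    "b2": {"p1": 0.4, "p2": 0.8, "p3": 0.6},
}
est_acc: Table = {
    "b1": {"p1": 0.8, "p2": 0.4, "p3": 0.7},
    "b2": {"p1": 0.3, "p2": 0.9, "p3": 0.5},
}

learned = greedy_advising_set(["p1", "p2", "p3"], est_acc, true_acc, k=2)
learned_accuracy = advisor_accuracy(learned, est_acc, true_acc)
```

On this toy data the greedy pass first selects `p1`, the best single choice on average, and then adds `p2`, because the estimator correctly prefers `p2` on benchmark `b2`. Since the objective depends on the estimator's picks and not only on true accuracies, adding a parameter choice can lower the advisor's accuracy when the estimator overrates it.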