@MISC{_1methods, author = {}, title = {1 Methods 1.1 Haplotype Frequency Estimation}, year = {} }

Abstract

The MM principle involves two steps. In maximization, we first minorize and then maximize. In minimization, we first majorize and then minimize. Because we are interested in penalized maximum likelihood estimation, we interpret MM in the former sense. The minorization step creates a surrogate function $q \mapsto g(q \mid q_n)$ anchored at the current iterate $q_n$ of a parameter search. The surrogate function falls below the objective function $f(q)$ and is tangent to it at the current point $q_n$. Formally, these conditions amount to

$$f(q_n) = g(q_n \mid q_n), \qquad f(q) \ge g(q \mid q_n), \quad q \ne q_n.$$

The maximization step of the MM algorithm solves for the parameter vector $q_{n+1}$ maximizing the surrogate function $g(q \mid q_n)$. In the process the objective function $f(q)$ is sent uphill.

The traditional EM algorithm for haplotype frequency estimation can be interpreted as an MM algorithm. Let $q$ be the vector of haplotype frequencies and $H_i$ be the set of ordered haplotype pairs $(k, l)$ consistent with subject $i$'s observed multilocus genotype. In this notation subject $i$'s likelihood is written as

$$\ell_i(q) = \sum_{(k,l) \in H_i} q_k q_l.$$

The full loglikelihood across all independent samples equals

$$L(q) = \sum_i \ln \ell_i(q).$$

To encourage parsimony, we subtract a penalty that tends to eliminate haplotypes with low explanatory power. The penalty is defined by a threshold $\delta$, a tuning constant $\lambda$ scaling the strength of the penalty, and the penalty function

$$p(q) = \begin{cases} q, & q \le \delta, \\ \delta, & q > \delta. \end{cases}$$

Accordingly, the haplotype frequency vector $q$ is estimated by maximizing the objective function

$$f(q) = L(q) - \lambda \sum_j p(q_j). \tag{1}$$

The concavity of the natural logarithm entails the minorization