  Efficient reconstruction of haplotype structure via perfect phylogeny (2003) [40 citations — 8 self]

by Eleazar Eskin, Eran Halperin, Richard M. Karp
Journal of Bioinformatics and Computational Biology
http://digitalassets.lib.berkeley.edu/techreports/ucb/text/CSD-02-1196.ps

Abstract:

Each person’s genome contains two copies of each chromosome, one inherited from the father and the other from the mother. A person’s genotype specifies the pair of bases at each site, but does not specify which base occurs on which chromosome. The sequence of each chromosome separately is called a haplotype. The determination of the haplotypes within a population is essential for understanding genetic variation and the inheritance of complex diseases. The haplotype mapping project, a successor to the human genome project, seeks to determine the common haplotypes in the human population. Since experimental determination of a person’s genotype is less expensive than determining its component haplotypes, algorithms are required for computing haplotypes from genotypes. Two observations aid in this process: first, the human genome contains short blocks within which only a few different haplotypes occur; second, as suggested by Gusfield, it is reasonable to assume that the haplotypes observed within a block have evolved according to a perfect phylogeny, in which at most one mutation event has occurred at any site. We present a simple and efficient polynomial-time algorithm for inferring haplotypes from the genotypes of a set of individuals assuming a perfect phylogeny. Using a reduction to 2-SAT, we extend this algorithm to handle constraints that apply when we have genotypes from both parents and child. We also present a hardness result for the problem of removing the minimum number of individuals from a population to ensure that the genotypes of the remaining individuals are consistent with a perfect phylogeny. Our algorithms have been tested on real data and give biologically meaningful results.
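
The relationship between haplotypes, genotypes, and the perfect phylogeny assumption can be illustrated with a minimal Python sketch. This is not the authors' algorithm. It assumes a binary encoding that the abstract does not specify (each site is 0 or 1 on a haplotype, and a genotype records 2 at heterozygous sites), and it uses the standard four-gamete test, under which a set of binary haplotypes fits an unrooted perfect phylogeny exactly when no pair of sites shows all four combinations 00, 01, 10, and 11. The function names `genotype` and `admits_perfect_phylogeny` are hypothetical.

```python
from itertools import combinations


def genotype(h1, h2):
    """Combine two binary haplotypes into an unphased genotype.

    Assumed encoding (not from the paper): 0 or 1 where both copies
    agree (homozygous site), 2 where they differ (heterozygous site).
    """
    if len(h1) != len(h2):
        raise ValueError("haplotypes must cover the same sites")
    return [a if a == b else 2 for a, b in zip(h1, h2)]


def admits_perfect_phylogeny(haplotypes):
    """Four-gamete test on a list of equal-length binary haplotypes.

    Returns False as soon as some pair of sites exhibits all four
    combinations 00, 01, 10, 11; otherwise returns True.
    """
    if not haplotypes:
        return True
    n_sites = len(haplotypes[0])
    for i, j in combinations(range(n_sites), 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            return False
    return True


# Two different haplotype pairs produce the same genotype, which is
# why haplotypes cannot be read directly off genotype data.
pair_a = ([0, 0], [1, 1])
pair_b = ([0, 1], [1, 0])
same_genotype = genotype(*pair_a) == genotype(*pair_b)
```

Here `pair_a` and `pair_b` both yield a genotype that is heterozygous at both sites, so the genotype alone cannot tell them apart. A structural assumption such as perfect phylogeny, checked across the whole population, is what allows some candidate resolutions to be ruled out.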

Citations

512 Optimization, Approximation, and Complexity Classes – Papadimitriou, Yannakakis - 1991
511 The complexity of theorem-proving procedures – Cook
200 On the complexity of timetable and multicommodity flow problems – Even, Itai, et al. - 1976
143 A new statistical method for haplotype reconstruction from population data – Stephens, Smith, et al. - 2000
114 High resolution haplotype structure in the human genome – Daly, Rioux, et al. - 2001
112 Approximate max-flow min-(multi)cut theorems and their applications – Garg, Vazirani, et al.
93 Maximum-likelihood estimation of molecular haplotype frequencies in a diploid population. Molecular Biology and Evolution – Excoffier, Slatkin - 1995
82 Inference of haplotypes from PCR-amplified samples of diploid populations – Clark - 1990
79 On selecting a satisfying truth assignment – Papadimitriou - 1991
79 Efficient Algorithms for Inferring Evolutionary Trees – Gusfield - 1991
71 Haplotyping as perfect phylogeny: conceptual framework and efficient solutions – Gusfield
59 Blocks of limited haplotype diversity revealed by high-resolution scanning of human chromosome 21 – Patil, Berno, et al.
45 Inference of haplotypes from samples of diploid populations: complexity and algorithms – Gusfield
38 Haplotyping as perfect phylogeny: A direct approach – Bafna, Gusfield, et al. - 2003
33 A linear time algorithm for testing the truth of certain quantified boolean formulas – Aspvall, Plass, et al. - 1979
30 Haplo: a program using the EM algorithm to estimate the frequencies of multi-site haplotypes – Hawley, Kidd - 1995
30 An E-M algorithm and testing strategy for multiple-locus haplotypes – Long, Williams, et al. - 1995
19 Accuracy of haplotype frequency estimation for biallelic loci, via the expectation-maximization algorithm for unphased diploid genotype data – Schork
16 An algorithm for determining whether a given binary matroid is graphic – Tutte - 1960
14 A practical algorithm for optimal inference of haplotypes from diploid populations – Gusfield - 2000
8 SNPs problems, algorithms and complexity. European Symposium on Algorithms – Lancia, Bafna, et al. - 2001
6 An almost linear time algorithm for graph realization – Bixby, Wagner - 1988
3 Large scale recovery of haplotypes from genotype data using imperfect phylogeny – Eskin, Halperin, et al. - 2003
2 A practical solution for haplotype mapping. Unpublished manuscript – Halperin, Eskin - 2002