K. Sjolander, K. Karplus, M. Brown, R. Hughey, A. Krogh, I. Mian and D. Haussler, "Dirichlet mixtures: A method for improving detection of weak but significant protein sequence homology," Computer Applications in the Biosciences 12 (1996).
....be those that are detected by multiple models. Future and current work include testing our results on proteins that have a CxxS motif rather than CxxC and using the identities of the two residues embedded in the CxxC and CxxS motifs. We are also exploring building new Dirichlet mixture priors [18] on our remapped alphabets to replace the uniform priors mentioned in Footnote 1. We are also looking at using other reduced alphabets such as those produced by applying the algorithm of Cannata et al. [4]. Preliminary jackknife tests of various alphabets produced by Cannata et al.'s algorithm ....
K. Sjolander, K. Karplus, M. Brown, R. Hughey, A. Krogh, I. S. Mian, and D. Haussler. Dirichlet mixtures: A method for improving detection of weak but significant protein sequence homology. Computer Applications in the Biosciences, 12(4):327--345, 1996.
....tend to appear in groups that share similar biochemical properties such as the hydrophobic group. The presence of one amino acid in the group increases the likelihood of seeing other amino acids from the same group. A method that addresses these problems is mixtures of Dirichlet distributions [4, 28]. The expectation and the maximum a posteriori value of a (single-component) random variable from the Dirichlet distribution can be viewed as a smoothing process with a pseudocount. A mixture of Dirichlet distributions can encode the grouping information by having a component in the mixture for ....
.... We perform our experiments over the same data set that was used in [23]. The data consists of sets of observed counts of amino acids taken from the BLOCKS database [20, 18]. The counts are weighted using a position-specific weighting scheme described in [21] with slight variations presented in [28]. The data set was split into disjoint training and test subsets. For each experiment, we compute equation (20) for a different size of the sample k where 0 ≤ k ≤ 5. For each sample size, we compute the excess entropy of the methods over all possible samples drawn from each column. We compare the ....
[Article contains additional citation context not shown here]
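The pseudocount view of the single-component Dirichlet estimate described in the context above can be illustrated with a minimal sketch. It assumes a toy four-letter alphabet and a uniform prior; the function name `dirichlet_posterior_mean` and the example counts are hypothetical and do not come from the cited papers.

```python
def dirichlet_posterior_mean(counts, alpha):
    """Posterior mean under one Dirichlet prior.

    Each alpha_i acts as a pseudocount added to the observed count n_i:
        p_i = (n_i + alpha_i) / (N + sum(alpha))
    """
    if len(counts) != len(alpha):
        raise ValueError("counts and alpha must have the same length")
    total = sum(counts) + sum(alpha)
    return [(n + a) / total for n, a in zip(counts, alpha)]


# Hypothetical column over a toy 4-letter alphabet (real use would have 20 amino acids).
counts = [3, 1, 0, 0]
alpha = [1.0, 1.0, 1.0, 1.0]  # uniform prior: one pseudocount per letter
probs = dirichlet_posterior_mean(counts, alpha)
```

Letters that were never observed still receive nonzero probability, which is the smoothing effect the snippet refers to.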
K. Sjolander, K. Karplus, M. Brown, R. Hughey, A. Krogh, I. S. Mian, and D. Haussler. Dirichlet mixtures: A method for improving detection of weak but significant protein sequence homology. Computer Applications in the Biosciences, 12(4):327--345, 1996.
....tend to appear in groups that share similar biochemical properties such as the hydrophobic group. The presence of one amino acid in the group increases the likelihood of seeing other amino acids from the same group. A method that addresses these problems is mixtures of Dirichlet distributions [4, 27]. The expectation and the maximum a posteriori value of a (single-component) random variable from the Dirichlet distribution can be viewed as a smoothing process with a pseudocount. A mixture of Dirichlet distributions can encode the grouping information by having a component in the mixture for ....
.... We perform our experiments over the same data set that was used in [22]. The data consists of sets of observed counts of amino acids taken from the BLOCKS database [19, 17]. The counts are weighted using a position-specific weighting scheme described in [20] with slight variations presented in [27]. The data set was split into disjoint training and test subsets. For each experiment, we compute equation (20) for a different size of the sample k where 0 ≤ k ≤ 5. For each sample size, we compute the excess entropy of the methods over all possible samples drawn from each column. We compare the ....
[Article contains additional citation context not shown here]
K. Sjolander, K. Karplus, M. Brown, R. Hughey, A. Krogh, I. S. Mian, and D. Haussler. Dirichlet mixtures: A method for improving detection of weak but significant protein sequence homology. Computer Applications in the Biosciences, 12(4):327--345, 1996.
.... $\prod_{i=1}^{20} p_i^{\alpha_i - 1} / Z$ (16) where $Z$ is the normalizing constant such that $\rho$ integrates to unity. Letting $\alpha = \sum_{i=1}^{20} \alpha_i$, the Dirichlet density with parameters $\alpha_1, \ldots, \alpha_{20}$ is peaked around the amino acid distribution where $p_i = \alpha_i / \alpha$. The larger $\alpha$ is, the more peaked is the density [14]. To capture several underlying distributions, a linear combination of $k$ Dirichlet density components is used to produce a Dirichlet mixture, $\rho = q_1 \rho_1 + q_2 \rho_2 + \cdots + q_k \rho_k$ (17). 3.4 Parameter estimation: The 20k alpha parameters and the 20 mixture coefficients $q_j$ are ....
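As an illustration of how a Dirichlet mixture prior of the form in equation (17) combines with observed counts, here is a minimal sketch of the standard Bayesian posterior-mean estimate. It assumes a toy four-letter alphabet with two made-up components; the function names, weights, and alpha values are hypothetical, and this is not presented as the implementation used by any of the citing papers or tools.

```python
import math


def log_marginal(counts, alpha):
    """log P(counts | alpha) for one Dirichlet component.

    The multinomial coefficient is omitted because it is identical for
    every component and cancels when the weights are normalized.
    """
    out = math.lgamma(sum(alpha)) - math.lgamma(sum(counts) + sum(alpha))
    for n, a in zip(counts, alpha):
        out += math.lgamma(n + a) - math.lgamma(a)
    return out


def mixture_posterior_mean(counts, weights, alphas):
    """Return (estimated probabilities, posterior component weights)."""
    logs = [math.log(q) + log_marginal(counts, a)
            for q, a in zip(weights, alphas)]
    top = max(logs)  # subtract the maximum for numerical stability
    unnorm = [math.exp(x - top) for x in logs]
    z = sum(unnorm)
    post = [u / z for u in unnorm]

    n_total = sum(counts)
    est = [0.0] * len(counts)
    for w, a in zip(post, alphas):
        denom = n_total + sum(a)
        for i, (n, a_i) in enumerate(zip(counts, a)):
            # Each component contributes its own pseudocount estimate.
            est[i] += w * (n + a_i) / denom
    return est, post


# Hypothetical two-component prior: letters 0-1 form one group, 2-3 another.
weights = [0.5, 0.5]
alphas = [[5.0, 5.0, 0.5, 0.5],
          [0.5, 0.5, 5.0, 5.0]]
counts = [2, 0, 0, 0]  # only letter 0 has been observed
est, post = mixture_posterior_mean(counts, weights, alphas)
```

In this toy case, observing only letter 0 shifts nearly all posterior weight to the first component, so the unobserved letter 1 from the same group receives a much higher estimate than letters 2 and 3.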
Sjölander K., Karplus K., Brown M. P., Hughey R., Krogh A., Mian I. S., and Haussler D. 1996 Dirichlet mixtures: A method for improving detection of weak but significant protein sequence homology. Computer Applications in the Biosciences 12(4):327-345.
....probabilities in an HMM. For protein models, by default, HMMER uses a nine-component mixture Dirichlet prior for match emissions, and single-component Dirichlet priors for insert emissions and transitions. The nine-component match emission mixture Dirichlet comes from the work of Kimmen Sjolander (Sjolander et al. 1996). For DNA/RNA models, by default, HMMER uses single-component Dirichlets. Two example null model files, amino.pri and nucleic.pri, are provided in the Demos subdirectory of the HMMER distribution. They are copies of the internal default HMMER prior settings. The way the format of these files ....
Sjolander, K., Karplus, K., Brown, M., Hughey, R., Krogh, A., Mian, I. S., and Haussler, D. (1996). Dirichlet mixtures: A method for improving detection of weak but significant protein sequence homology. Comput. Applic. Biosci., 12:327--345.
....parameters and therefore require a large amount of training data. A typical 200-state HMM may contain on the order of 5000 trainable parameters. Adequate training of such a model can require on the order of 200 homologous sequences [13]. The use of empirically derived Dirichlet mixture priors [13, 38] can partially offset the need for larger training sets. The size of the model may be greatly reduced by focusing only upon regions that are highly conserved across family members. Usually these regions, called motifs, have been conserved by evolution for important structural or functional ....
K. Sjolander, K. Karplus, M. Brown, R. Hughey, A. Krogh, I. S. Mian, and D. Haussler. Dirichlet mixtures: A method for improving detection of weak but significant protein sequence homology. Computer Applications in the Biosciences, 1996.
....proteins in database search [6]. Our HMM fold recognition method differs from protein threading methods [13, 23, 16, 17] in that pairwise interactions are not modeled or used. Instead, we employ Bayesian methods [4, 3, 21] to incorporate prior information in the form of Dirichlet mixture densities [24] over position-specific amino acid distributions, and over insertion and deletion probabilities in different structural environments (Section 2.1). The priors reflect different patterns of sequence conservation, such as invariant or hydrophobic, and can be combined with data from aligned homologs ....
K. Sjolander, K. Karplus, M. P. Brown, R. Hughey, A. Krogh, I. S. Mian, and D. Haussler. Dirichlet mixtures: A method for improving detection of weak but significant protein sequence homology. CABIOS, 12(4):327--345, 1996.
No context found.
K. Sjolander, K. Karplus, M. P. Brown, R. Hughey, A. Krogh, I. S. Mian, and D. Haussler. Dirichlet mixtures: A method for improving detection of weak but significant protein sequence homology. to appear in CABIOS, 1996.
No context found.
K. Sjolander, K. Karplus, M. Brown, R. Hughey, A. Krogh, I. Mian and D. Haussler, "Dirichlet mixtures: A method for improving detection of weak but significant protein sequence homology," Computer Applications in the Biosciences 12 (1996).
No context found.
K Sjolander, K Karplus, M Brown, R Hughey, A Krogh, IS Mian, and D Haussler, "Dirichlet mixtures: A method for improving detection of weak but significant protein sequence homology," CABIOS, vol. 12, no. 4, pp. 327--45, Aug 1996.
No context found.
Sjolander, K., Karplus, K., Brown, M., Hughey, R., Krogh, A., Mian, I. S., & Haussler, D. (1996). Dirichlet mixtures: A method for improving detection of weak but significant protein sequence homology. Computer Applications in the Biosciences, 12, 327--345.
No context found.
K. Sjolander, K. Karplus, M. Brown, R. Hughey, A. Krogh, I.S. Mian, and D. Haussler. Dirichlet mixtures: A method for improving detection of weak but significant protein sequence homology. Computer Applications in the Biosciences, 12, 1996.
No context found.
Sjölander,K., Karplus,K., Brown,M., Hughey,R., Krogh,A., Mian,I.S. and Haussler,D. (1996) Dirichlet mixtures: A method for improving detection of weak but significant protein sequence homology. Comput. Applic. Biosci., 12, 327--345.
No context found.
Sjölander,K., Karplus,K., Brown,M.P., Hughey,R., Krogh,A., Mian,I.S. and Haussler,D. (1996). Dirichlet mixtures: A method for improving detection of weak but significant protein sequence homology. Comput. Applic. Biosci., 12, 327--345.
No context found.
K. Sjolander, K. Karplus, M. Brown, R. Hughey, A. Krogh, I. S. Mian and D. Haussler, "Dirichlet Mixtures: A method for improving detection of weak but significant protein sequence homology," CABIOS 12, 327 (1996).