Printz, H. Fast computation of maximum entropy/minimum divergence model feature gain. In Proceedings of the Fifth International Conference on Spoken Language Processing (November 1998).
....is the first application of MEMD to building a large-scale translation model, and one of the few direct comparisons between a MEMD model and an almost exactly equivalent linear model. I also compare several different techniques for MEMD feature selection, including a new algorithm due to Printz [103]. Rosenfeld [106] reports a greater perplexity reduction (23% versus 10%) over a baseline trigram language model due to long-distance word pair predictions in a maximum entropy (ME) framework .... In this section I describe the two models to be compared. The linear model I used as a baseline is ....
....this method requires many expensive passes over the corpus to optimize the weights for the set of features under consideration at each step, and it adds only one feature per step, so it is not practical for evaluating feature sets containing thousands of features or more. In a recent paper [103], Printz argues that it is usually sufficient to perform the iteration described in the previous paragraph only once; in other words, features can be ranked simply according to their gain with respect to some initial model. He also gives an algorithm for computing approximate gains which requires ....
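The single-pass ranking idea in the excerpt above can be illustrated with a minimal sketch. This is not Printz's approximate-gain algorithm, which the excerpt does not spell out. It assumes the simplest setting, an unconditional model with binary features, where the gain of adding one feature is the KL divergence between two Bernoulli distributions: one with the feature's empirical expectation and one with its expectation under the initial model. The function names and the example numbers are hypothetical.

```python
import math


def binary_feature_gain(p_emp, q_model):
    """Gain (in nats per sample) from adding one binary feature to an
    unconditional model, assuming the closed form
    KL(Bernoulli(p_emp) || Bernoulli(q_model))."""
    if not 0.0 < q_model < 1.0:
        raise ValueError("model expectation must lie strictly between 0 and 1")
    if not 0.0 <= p_emp <= 1.0:
        raise ValueError("empirical expectation must lie in [0, 1]")

    def term(p, q):
        # By convention 0 * log(0 / q) is taken to be 0.
        return 0.0 if p == 0.0 else p * math.log(p / q)

    return term(p_emp, q_model) + term(1.0 - p_emp, 1.0 - q_model)


def rank_features_once(empirical, model):
    """Rank candidate features a single time by their gain with respect
    to the initial model, without re-fitting after each selection."""
    gains = {
        name: binary_feature_gain(empirical[name], model[name])
        for name in empirical
    }
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Hypothetical expectations for three candidate features.
empirical = {"a": 0.5, "b": 0.2, "c": 0.12}
initial_model = {"a": 0.5, "b": 0.1, "c": 0.1}
ranking = rank_features_once(empirical, initial_model)
```

Here feature "a" has zero gain because the initial model already matches its empirical expectation, so it ranks last. For conditional models no such closed form is available in general, and the gain has to be found by a one-dimensional numerical optimization per feature.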
Harry Printz. Fast computation of Maximum Entropy/Minimum Divergence feature gain. In ICSLP-98 [60], pages 2083-2086.