J.-S. Chang and K.-Y. Su. An Unsupervised Iterative Method for Chinese New Lexicon Extraction. International Journal of Computational Linguistics & Chinese Language Processing, 1997.


This paper is cited in the following contexts:
The Sparse Data Problem in Statistical Language Modeling and.. - Peng

....meaningful patterns. Unfortunately, segmenting an input sentence into words is a nontrivial task. There has been a significant amount of research on techniques for discovering word segmentation boundaries; see for example [BC96, Bre99, DB95, MA97, CAS98, dM95, KW99, Kit00, ZGCL00, Hua00, AL00, CS97, GPS99, PC96, Jin92, SS90, SSGC96], among which there are at least two Ph.D. theses [Kit00, dM95]. The main idea behind most of these techniques is to start with a lexicon that contains the set of possible words and then segment a concatenated character string by optimizing a heuristic objective ....

J.-S. Chang and K.-Y. Su. An Unsupervised Iterative Method for Chinese New Lexicon Extraction. International Journal of Computational Linguistics & Chinese Language Processing, 1997.


Automatic Multi-Lingual Information Extraction - Peng (2001)

....to build a lexicon. Unfortunately, since there are over 20,000 Chinese characters, among which 6763 are most commonly used, building a complete lexicon by hand is impractical. Therefore a number of unsupervised segmentation methods have been proposed recently to segment Chinese and Japanese text [1, 15, 40, 74, 47]. Most of these approaches use some form of EM to learn a probabilistic model of character sequences and then employ Viterbi-decoding-like procedures to segment new text into words. One reason that the EM algorithm is widely adopted for unsupervised training is that it is guaranteed to converge to a ....

Chang, J.-S. and Su, K.-Y.; An Unsupervised Iterative Method for Chinese New Lexicon Extraction. International Journal of Computational Linguistics & Chinese Language Processing, 1997.
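
The context above mentions Viterbi-decoding-like procedures that segment text under a probabilistic model. A minimal sketch of that idea follows, assuming a zero-order (unigram) word model; the function name `viterbi_segment`, the `word_probs` dictionary, and the `max_len` and `unk_prob` parameters are hypothetical and are not taken from any of the cited papers.

```python
import math

def viterbi_segment(text, word_probs, max_len=4, unk_prob=1e-8):
    """Return the most probable segmentation of `text` under a unigram model.

    word_probs: hypothetical lexicon mapping word -> probability.
    max_len:    longest candidate word considered (an assumption).
    unk_prob:   probability given to single characters missing from the lexicon.
    """
    n = len(text)
    best = [-math.inf] * (n + 1)   # best[i]: best log-probability of text[:i]
    back = [0] * (n + 1)           # back[i]: start index of the last word in text[:i]
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            p = word_probs.get(word)
            if p is None:
                if len(word) > 1:
                    continue       # unknown multi-character strings are not words
                p = unk_prob       # fall back to a single unknown character
            score = best[start] + math.log(p)
            if score > best[end]:
                best[end] = score
                back[end] = start
    # Follow the back-pointers from the end of the string to recover the words.
    words = []
    i = n
    while i > 0:
        words.append(text[back[i]:i])
        i = back[i]
    return words[::-1]
```

In an EM setting, the probabilities in `word_probs` would be re-estimated from the segmentations of a training corpus and the procedure repeated; that loop is omitted here.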


Self-supervised Chinese Word Segmentation - Peng, Schuurmans (2001)   (1 citation)

....to build a lexicon. Unfortunately, since there are over 20,000 Chinese characters, among which 6763 are most commonly used, building a complete lexicon by hand is impractical. Therefore a number of unsupervised segmentation methods have been proposed recently to segment Chinese and Japanese text [1, 3, 8, 12, 9]. Most of these approaches use some form of EM to learn a probabilistic model of character sequences and then employ Viterbi-decoding-like procedures to segment new text into words. One reason that the EM algorithm is widely adopted for unsupervised training is that it is guaranteed to converge to a ....

....standard EM segmentation can be thought of as a zero order HMM. Mutual Information Lexicon Optimization: Other researchers have considered using mutual information to build a lexicon. For example, [14] uses mutual information to build a lexicon, but only deals with words of up to 2 characters. [3, 12] uses mutual information and context information to build a lexicon based on the statistics directly obtained from the training corpus. By contrast, we are using mutual information to prune a given lexicon. That is, instead of building a lexicon from scratch, we first add all possible words and ....

Chang, J.-S. and Su, K.-Y.; An Unsupervised Iterative Method for Chinese New Lexicon Extraction. International Journal of Computational Linguistics & Chinese Language Processing, 1997.
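
The context above refers to using mutual information over words of up to 2 characters. A minimal sketch of one common formulation, pointwise mutual information between adjacent characters, is given below; the function name `bigram_pmi` and the maximum-likelihood estimates are assumptions made for illustration, not the procedure of any cited paper.

```python
import math
from collections import Counter

def bigram_pmi(corpus):
    """Pointwise mutual information (base 2) for adjacent character pairs.

    corpus: iterable of unsegmented strings (hypothetical input format).
    Returns a dict mapping each two-character string to its PMI score.
    """
    chars = Counter()
    pairs = Counter()
    for line in corpus:
        chars.update(line)                  # single-character counts
        pairs.update(zip(line, line[1:]))   # adjacent-pair counts
    n_chars = sum(chars.values())
    n_pairs = sum(pairs.values())
    pmi = {}
    for (a, b), count in pairs.items():
        p_ab = count / n_pairs
        p_a = chars[a] / n_chars
        p_b = chars[b] / n_chars
        # A high score means the pair co-occurs more often than chance predicts.
        pmi[a + b] = math.log2(p_ab / (p_a * p_b))
    return pmi
```

Pairs with high scores are candidate two-character lexicon entries; a pruning approach would instead drop existing entries whose scores fall below some threshold.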


A Multivariate Gaussian Mixture Model for Automatic Compound.. - Chang, Su   Self-citation (Jing-shin Su)

No context found.

Chang, Jing-Shin and Keh-Yih Su, 1997a. "An Unsupervised Iterative Method for Chinese New Lexicon Extraction", to appear in International Journal of Computational Linguistics & Chinese Language Processing, 1997.


Using Self-Supervised Word Segmentation in Chinese .. - Peng, Huang.. (2001)

No context found.

J.-S. Chang and K.-Y. Su. An Unsupervised Iterative Method for Chinese New Lexicon Extraction. International Journal of Computational Linguistics & Chinese Language Processing, 1997.


Applying Machine Learning to Text Segmentation for .. - Huang, Peng.. (2002)   (2 citations)

No context found.

J.-S. Chang and K.-Y. Su. An Unsupervised Iterative Method for Chinese New Lexicon Extraction. International Journal of Computational Linguistics & Chinese Language Processing, 1997.


Waterloo at NTCIR-3: Using Self-supervised Word.. - Huang, Peng, Schuurmans, ..

No context found.

J.-S. Chang and K.-Y. Su. An Unsupervised Iterative Method for Chinese New Lexicon Extraction. International Journal of Computational Linguistics & Chinese Language Processing, 1997.


