Reusing an Ontology to Generate Numeral Classifiers
- In 18th International Conference on Computational Linguistics: COLING-2000, 2000
Cited by 7 (2 self)
In this paper, we present a solution to the problem of generating Japanese numeral classifiers using semantic classes from an ontology. Most nouns must take a numeral classifier when they are quantified in languages such as Chinese, Japanese, Korean, Malay and Thai. In order to select an appropriate classifier, we propose an algorithm which associates classifiers with semantic classes and uses inheritance to list only those classifiers which have to be listed. It generates sortal classifiers with an accuracy of 81%. We reuse the ontology provided by Goi-Taikei, a Japanese lexicon, and show that it is a reasonable choice for this task, requiring information to be entered for less than 6% of individual nouns.
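The inheritance idea in this abstract can be illustrated with a minimal sketch. The toy hierarchy, the class names, the romanised classifier values, and the function `classifier_for` are all hypothetical and are not taken from Goi-Taikei or the paper. The sketch assumes only that a classifier listed on an individual noun overrides the one inherited from its nearest ancestor class.

```python
from typing import Optional

# Hypothetical toy ontology: child class -> parent class (None marks the root).
PARENT = {
    "noun": None,
    "concrete": "noun",
    "animal": "concrete",
    "bird": "animal",
    "artifact": "concrete",
}

# Default classifiers attached to semantic classes (illustrative values only).
CLASS_CLASSIFIER = {
    "noun": "tsu",
    "animal": "hiki",
    "bird": "wa",
}

# Exceptions listed on individual nouns; most nouns need no entry here.
NOUN_CLASSIFIER = {"usagi": "wa"}

# Semantic class assigned to each noun (illustrative).
NOUN_CLASS = {
    "inu": "animal",
    "suzume": "bird",
    "usagi": "animal",
    "hako": "artifact",
}


def classifier_for(noun: str) -> Optional[str]:
    """Return the noun's own classifier, else the nearest inherited one."""
    if noun in NOUN_CLASSIFIER:
        return NOUN_CLASSIFIER[noun]
    cls = NOUN_CLASS.get(noun)
    # Walk up the hierarchy until some ancestor class carries a classifier.
    while cls is not None:
        if cls in CLASS_CLASSIFIER:
            return CLASS_CLASSIFIER[cls]
        cls = PARENT[cls]
    return None
```

In this toy setup only one of the four nouns carries its own entry; the others inherit a classifier from a class, which is the kind of saving the abstract reports for the real lexicon.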
Using an ontology to determine English countability
- In Proc. of 19th International Conference on Computational Linguistics, 2002
Cited by 4 (1 self)
In this paper we show to what degree the countability of English nouns is predictable from their semantics. We found that 78% of nouns' countability could be predicted using an ontology of 2,710 nodes. We also show how this predictability can be used to help non-native speakers determine the countability of English nouns when building a bilingual machine translation lexicon.
Multilingual Generation of Numeral Classifiers using a Common Ontology
- In Proceedings of the 19th International Conference on Computer Processing of Oriental Languages, 2001
Cited by 3 (1 self)
In this paper, we present a solution to the problem of generating both Japanese and Korean numeral classifiers using semantic classes from an ontology. Most nouns must use a numeral classifier when they are quantified in languages such as Chinese, Japanese, Korean, Malay and Thai. In order to select an appropriate classifier, we propose an algorithm which associates classifiers with semantic classes and uses inheritance to list only exceptional classifiers with individual nouns. The algorithm generates sortal classifiers with an accuracy of 81%. We reuse the ontology provided by Goi-Taikei, a Japanese lexicon, and show that it is a reasonable choice for this task, requiring information to be entered for less than 6% of individual nouns.
Keywords: multilingual generation, numeral classifiers, Japanese, Korean
Learning Count Classifier Preferences of Malay Nouns
Cited by 1 (1 self)
We develop a data set of Malay lexemes labelled with count classifiers that are attested in raw or lemmatised corpora. A maximum entropy classifier based on simple, language-inspecific features generated from context tokens achieves about 50% F-score, or about 65% precision when a suite of binary classifiers is built to aid multi-class prediction of headword nouns. Surprisingly, numeric features are not observed to aid classification. This system represents a useful step for semi-supervised lexicography across a range of languages.
Anchoring floating quantifiers in Japanese-to-English machine translation
- In 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics: COLING/ACL-98, 1998
Cited by 1 (1 self)
In this paper we present an algorithm to anchor floating quantifiers in Japanese, a language in which quantificational nouns and numeral-classifier combinations can appear separated from the noun phrase they quantify. The algorithm differentiates degree and event modifiers from nouns that quantify noun phrases. It then finds a suitable anchor for such floating quantifiers. To do this, the algorithm considers the part of speech of the quantifier and the target, the semantic relation between them, the case marker of the antecedent and the meaning of the verb that governs the two constituents. The algorithm has been implemented and tested in a rule-based Japanese-to-English machine translation system, with an accuracy of 76% and a recall of 97%.
Web and Corpus Methods for Malay Count Classifier Prediction
We examine the capacity of Web and corpus frequency methods to predict preferred count classifiers for nouns in Malay. The observed F-score for the Web model of 0.671 considerably outperformed corpus-based frequency and machine learning models. We expect that this is a fruitful extension for Web–as–corpus approaches to lexicons in languages other than English, but further research is required in other South-East and East Asian languages.
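A frequency-based prediction of this kind can be sketched as follows. The counts in the usage note are invented placeholders, not results from the paper, and `predict_classifier` is a hypothetical helper. The sketch assumes only that each candidate classifier has a count of attested numeral-classifier-noun patterns, whether from Web hits or a corpus, and that the most frequent candidate is preferred.

```python
from typing import Dict, Optional


def predict_classifier(counts: Dict[str, int]) -> Optional[str]:
    """Pick the classifier with the highest pattern count for one noun.

    `counts` maps each candidate classifier to how often it was observed
    with the noun. Ties are broken alphabetically so the result is
    deterministic; that tie-break is an assumption of this sketch.
    """
    attested = {c: n for c, n in counts.items() if n > 0}
    if not attested:
        return None  # no evidence for any candidate
    # max() keeps the first maximal item, so sorting fixes the tie-break.
    return max(sorted(attested), key=lambda c: attested[c])
```

For example, with made-up counts for one noun, `predict_classifier({"ekor": 120, "buah": 3, "orang": 0})` selects `"ekor"`, and a noun with no attested pattern yields `None`.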

