Results 1 - 10 of 32
WordNet: A Lexical Database for English
Communications of the ACM, 1995
Cited by 1013 (0 self)
Because meaningful sentences are composed of meaningful words, any system that hopes to process natural languages as people do must have information about words and their meanings. This information is traditionally provided through dictionaries, and machine-readable dictionaries are now widely available. But dictionary entries evolved for the convenience of human readers, not for machines. WordNet provides a more effective combination of traditional lexicographic information and modern computing. WordNet is an online lexical database designed for use under program control. English nouns, verbs, adjectives, and adverbs are organized into sets of synonyms, each representing a lexicalized concept. Semantic relations link the synonym sets [4]. Language Definitions We define the vocabulary of a language as a set W of pairs (f,s), where a form
Large-scale dictionary construction for foreign language tutoring and interlingual machine translation
Machine Translation, 1997
Cited by 71 (9 self)
This paper describes techniques for automatic construction of dictionaries for use in large-scale foreign language tutoring (FLT) and interlingual machine translation (MT) systems. The dictionaries are based on a language-independent representation called lexical conceptual structure (LCS). A primary goal of the LCS research is to demonstrate that synonymous verb senses share distributional patterns. In this paper, we show how the syntax-semantics relation can be used to develop a lexical acquisition approach that contributes both toward the enrichment of existing online resources and toward the development of lexicons containing more complete information than is provided in any of these resources alone. We start by describing the structure of the LCS and showing how this representation is used in FLT and MT. We then focus on the problem of building LCS dictionaries for large-scale FLT and MT. First, we describe authoring tools for manual and semi-automatic construction of LCS dictionaries; we then present a more sophisticated approach that uses linguistic techniques for building word definitions automatically. These techniques have been implemented as part of a set of lexicon-development tools used in the MILT FLT project (Dorr et al., 1995; Sams, 1995; Weinberg et al., 1995) and in the PRINCITRAN MT project (Dorr et al., 1995b).
The faculty of language: what’s special about it?
Cognition, 2005
Cited by 34 (4 self)
We examine the question of which aspects of language are uniquely human and uniquely linguistic in light of recent arguments by Hauser, Chomsky, and Fitch that the only such aspect is syntactic recursion, the rest of language being either specific to humans but not to language (e.g., words and concepts) or not specific to humans (e.g., speech perception). We find this argument problematic. It ignores the many aspects of grammar that are not recursive, such as phonology, morphology, case, and agreement. It is inconsistent with the anatomy and neural control of the human vocal tract. And it is weakened by experiments showing that speech perception cannot be reduced to primate audition, that word learning cannot be reduced to fact learning, and that at least one gene involved in speech and language was evolutionarily selected in the human lineage but is not specific to recursion. The recursion-only claim, we suggest, is motivated by Chomsky’s recent approach to syntax, the Minimalist Program, which de-emphasizes the same aspects of language. The approach, however, is sufficiently problematic that it cannot be used to support claims about evolution. We contest other arguments from Chomsky that language is not an adaptation, namely that it is “perfect,” nonredundant, unusable in any partial form, and badly designed for communication. The hypothesis that language is a complex adaptation for communication which evolved piecemeal avoids all these problems.
Degraded Text Recognition Using Visual And Linguistic Context
1995
Cited by 20 (2 self)
Recognition of degraded text is a challenging problem. To improve the performance of an OCR system on degraded images of text, postprocessing techniques are critical. The objective of postprocessing is to correct errors or to resolve ambiguities in OCR results by using contextual information. Depending on the extent of context used, there are different levels of postprocessing. In current commercial OCR systems, word-level postprocessing methods, such as dictionary-lookup, have been applied successfully. However, many OCR errors cannot be corrected by word-level postprocessing. To overcome this limitation, passage-level postprocessing, in which global contextual information is utilized, is necessary. In most current studies on passage-level postprocessing, linguistic context is the major resource to be exploited. This thesis addresses problems in degraded text recognition and discusses potential solutions through passage-level postprocessing. The objective is to develop a postprocessin...
Analysis of a Hand-Tagging Task
In Proceedings of ANLP-97 Workshop on Tagging Text with Lexical Semantics, 1997
Cited by 11 (4 self)
We analyze the results of a semantic annotation task performed by novice taggers as part of the WordNet SemCor project (Landes et al., in press). Each polysemous content word in a text was matched to a sense from WordNet. Comparing the performance of the novice taggers with that of experienced lexicographers, we find that the degree of polysemy, part of speech, and the position within the WordNet entry of the target words played a role in the taggers' choices. The taggers agreed on a sense choice more often than they agreed with two lexicographers, suggesting an effect of experience on sense distinction. Evidence indicates that taggers selecting senses from a list ordered by frequency of occurrence, where salient, core senses are found at the beginning of the entry, use a different strategy than taggers working with a randomly ordered list of senses.
Semantic Distance Effects on Object and Action Naming
Cited by 9 (7 self)
Graded interference effects were tested in a naming task, in parallel for objects and actions. Participants named either object or action pictures presented in the context of other pictures (blocks) that were either semantically very similar, or somewhat semantically similar or semantically dissimilar. We found that naming latencies for both object and action words were modulated by the semantic similarity between the exemplars in each block, providing evidence in both domains of graded semantic effects. Graded Semantic Effects in Object and Action Naming Miller and Fellbaum (1991) wrote: "When psychologists think about the organization of lexical memory it is nearly always the organization of nouns that they have in mind" (p.214). Even more specifically, we may add, often it is nouns referring to objects that we have in mind. Although the object-noun domain is certainly relevant to studies of lexical memory, it only represents part of adults' lexical knowledge; theories and tools deve...
Consistent criteria for sense distinctions
Computers and the Humanities. Senseval Special Issue, 2000
Cited by 8 (1 self)
This paper specifically addresses the question of polysemy with respect to verbs, and whether or not the sense distinctions that are made in on-line lexical resources such as WordNet are appropriate for computational lexicons. The use of sets of related syntactic frames and verb classes is examined as a means of simplifying the task of defining different senses, and the importance of concrete criteria such as different predicate argument structures, semantic class constraints and lexical co-occurrences is emphasized.
The Breakdown of Semantic Knowledge: Insights from a Statistical Model of Meaning Representation
2003
Cited by 8 (2 self)
Investigations of patients with semantic category-specific deficits have revealed a wide range of performance and variability in categories that are impaired or spared; this variability presents a challenge to accounts of category specificity. Accounts based only on impairment to semantic features of a particular type (e.g., visual), as well as accounts based only on featural properties (e.g., feature intercorrelations), are insufficient to explain the variability of patients' performance. A first goal of the paper is to discuss how a hybrid account incorporating both a level of organization according to feature-types (a level of nonlinguistic conceptual representations) and a level of organization dictated by featural properties may provide a more comprehensive account of the cases reported in the literature. The second and most novel goal of the study reported here is to derive from our hybrid account a series of novel predictions concerning the representation and impairment of a different domain of knowledge: knowledge of actions and events, a domain of knowledge that has received remarkably little attention to date. Keywords: category-specificity, nouns, verbs, semantics, simulation The breakdown of semantic knowledge: Insights from a statistical model of meaning representation. The study of patients in whom semantic knowledge has been disrupted has led to a number of important inferences concerning the underlying architecture of the semantic system (Warrington, 1975). Particularly relevant are cases in which focal brain damage creates category-specific deficits (i.e., selective impairment of semantic knowledge along category boundaries). At present there are a substantial number of cases on record (approximately 89, according to Rogers & Plaut, 2002). Specificity in...
Spanish EuroWordNet and LCS-Based Interlingual MT
In Proceedings of the MT Summit Workshop on Interlinguas in MT, 1997
Cited by 6 (4 self)
We present a machine translation framework in which the interlingua -- Lexical Conceptual Structure (LCS) -- is coupled with a definitional component that includes bilingual (EuroWordNet) links between words in the source and target languages. While the links between individual words are language-specific, the LCS is designed to be a language-independent, compositional representation. We take the view that the two types of information -- shallower, transfer-like knowledge as well as deeper, compositional knowledge -- can be reconciled in interlingual machine translation, the former for overcoming the intractability of LCS-based lexical selection, and the latter for relating the underlying semantics of two words cross-linguistically. We describe the acquisition process for these two information types and present results of hand-verification of the acquired lexicon. Finally, we demonstrate the utility of the two information types in interlingual MT.
Activating meaning in time: The role of imageability and form-class
Language and Cognitive Processes, 2002
Cited by 4 (0 self)
A number of studies have shown that the meanings of spoken words are activated early in processing, well before all of the word has been heard. However, these studies have not explicitly taken into account a number of variables which are known to affect word recognition processes. Two important variables are a word’s imageability and its form-class. In the experiments reported here we use a cross-modal priming task to investigate the role that these variables play on the time-course with which word meanings are activated. We present visual target words for lexical decision at different points through the duration of spoken primes. In one study the spoken primes were either abstract or concrete words, and in a second they were either nouns or verbs. We found significant priming for all types of words early in the duration of a spoken prime. We discuss these results in terms of various models of semantic activation, concluding that distributed models provide the best fit to the data.

