Results 1 - 9 of 9
CoNLL-2012 shared task: Modeling multilingual unrestricted coreference in OntoNotes
- In CoNLL 2012, 2012
Cited by 63 (9 self)
The CoNLL-2012 shared task involved predicting coreference in English, Chinese, and Arabic, using the final version, v5.0, of the OntoNotes corpus. It was a follow-on to the English-only task organized in 2011. Until the creation of the OntoNotes corpus, resources in this sub-field of language processing were limited to noun phrase coreference, often on a restricted set of entities, such as the ACE entities. OntoNotes provides a large-scale corpus of general anaphoric coreference not restricted to noun phrases or to a specified set of entity types, and covers multiple languages. OntoNotes also provides additional layers of integrated annotation, capturing additional shallow semantic structure. This paper describes the OntoNotes annotation (coreference and other layers) and then describes the parameters of the shared task including the format, pre-processing information, and evaluation criteria, and presents and discusses the results achieved by the participating systems. The task of coreference has had a complex evaluation history. Potentially many evaluation conditions have, in the past, made it difficult to judge the improvement in new algorithms over previously reported results. Having a standard test set and standard evaluation parameters, all based on a resource that provides multiple integrated annotation layers (syntactic parses, semantic roles, word senses, named entities and coreference) and in multiple languages could support joint modeling and help ground and energize ongoing research in the task of entity and event coreference.
Towards Robust Linguistic Analysis Using OntoNotes
Cited by 4 (1 self)
Large-scale linguistically annotated corpora have played a crucial role in advancing the state of the art of key natural language technologies such as syntactic, semantic and discourse analyzers, and they serve as training data as well as evaluation benchmarks. Up till now, however, most of the evaluation has been done on monolithic corpora such as the Penn Treebank and the Proposition Bank. As a result, it is still unclear how the state-of-the-art analyzers perform in general on data from a variety of genres or domains. The completion of the OntoNotes corpus, a large-scale, multi-genre, multilingual corpus manually annotated with syntactic, semantic and discourse information, makes it possible to perform such an evaluation. This paper presents an analysis of the performance of publicly available, state-of-the-art tools on all layers and languages in the OntoNotes v5.0 corpus. This should set the benchmark for future development of various NLP components in syntax and semantics, and possibly encourage research towards an integrated system that makes use of the various layers jointly to improve overall performance.
Dependency-based PropBanking of clinical Finnish
Cited by 3 (3 self)
In this paper, we present a PropBank of clinical Finnish, an annotated corpus of verbal propositions and arguments. The clinical PropBank is created on top of a previously existing dependency treebank annotated in the Stanford Dependency (SD) scheme and covers 90% of all verb occurrences in the treebank. We establish that the PropBank scheme is applicable to clinical Finnish as well as compatible with the SD scheme, with an overwhelming proportion of arguments being governed by the verb. This allows argument candidates to be restricted to direct verb dependents, substantially simplifying the PropBank construction. The clinical Finnish PropBank is freely available at the address
A Pilot PropBank Annotation for Quranic Arabic
Cited by 1 (0 self)
The Quran is a significant religious text written in a unique literary style, close to very poetic language in nature. Accordingly it is significantly richer and more complex than the newswire style used in the previously released Arabic PropBank (Zaghouani et al., 2010; Diab et al., 2008). We present preliminary work on the creation of a unique Arabic proposition repository for Quranic Arabic. We annotate the semantic roles for the 50 most frequent verbs in the Quranic Arabic Dependency Treebank (QATB) (Dukes and Buckwalter, 2010). The Quranic Arabic PropBank (QAPB) will be a unique new resource of its kind for the Arabic NLP research community as it will allow for interesting insights into the semantic use of classical Arabic, poetic literary Arabic, as well as significant religious texts. Moreover, on a pragmatic level QAPB will add approximately 810 new verbs to the existing Arabic PropBank (APB). In this pilot experiment, we leverage our knowledge and experience from our involvement in the APB project. All the QAPB annotations will be made freely available for research purposes.
Towards a dependency-based PropBank of general Finnish
- In Proceedings of the 19th Nordic Conference on Computational Linguistics (NoDaLiDa’13), 2013
Cited by 1 (1 self)
In this work, we present the first results of a project aiming at a Finnish Proposition Bank, an annotated corpus of semantic roles. The annotation is based on an existing treebank of Finnish, the Turku Dependency Treebank, annotated using the well-known Stanford Dependency scheme. We describe the use of the dependency treebank for PropBanking purposes and show that both annotation layers present in the treebank are highly useful for the annotation of semantic roles. We also discuss the specific features of Finnish influencing the development of a PropBank as well as the methods employed in the annotation, and finally, we present a preliminary evaluation of the annotation quality.
Revisiting Arabic Semantic Role Labeling using SVM Kernel Methods
Arabic is a critical language, so an Arabic Semantic Role Labeling (SRL) system has huge potential usefulness. This task involves two subtasks: predicate argument boundary detection and argument classification. Building on the innovations of Diab, Moschitti, and Pighin (2007) in Arabic Natural Language Processing (NLP), and in SRL in particular, we are currently developing a system for automatic SRL in Arabic.
Building a Lexical Semantic Resource for Arabic Morphological Patterns
We present a pilot study for building an Arabic morphological Pattern Net as a lexical semantic resource. In this paper, a limited number of Arabic Morphological Patterns have been selected in order to analyze the structure and behavior of some verbs in the Arabic PropBank [19]. Our goal is to study whether there is a direct relationship between morphological patterns and verbal semantic roles. The approach to building our morphological patterns' database is based on linguistic generalization of the semantic roles of the verbal predicates. The results obtained show the feasibility of a more comprehensive future study.
Towards a Dependency-based PropBank of General Finnish
In this work, we present the first results of a project aiming at a Finnish Proposition Bank, an annotated corpus of semantic roles. The annotation is based on an existing treebank of Finnish, the Turku Dependency Treebank, annotated using the well-known Stanford Dependency scheme. We describe the use of the dependency treebank for PropBanking purposes and show that both annotation layers present in the treebank are highly useful for the annotation of semantic roles. We also discuss the specific features of Finnish influencing the development of a PropBank as well as the methods employed in the annotation, and finally, we present a preliminary evaluation of the annotation quality.