Speech Recognition Grammar Compilation in Grammatical Framework
In Proceedings of the Workshop on Grammar-Based Approaches to Spoken Language Processing, 2007
Cited by 10 (1 self)
This paper describes how grammar-based language models for speech recognition systems can be generated from Grammatical Framework (GF) grammars. Context-free grammars and finite-state models can be generated in several formats: GSL, SRGS, JSGF, and HTK SLF. In addition, semantic interpretation code can be embedded in the generated context-free grammars. This enables rapid development of portable, multilingual and easily modifiable speech recognition applications.
A Multilingual Semantic Wiki Based on Attempto Controlled English and Grammatical Framework
Cited by 9 (1 self)
We describe a semantic wiki system with an underlying controlled natural language grammar implemented in Grammatical Framework (GF). The grammar restricts the wiki content to a well-defined subset of Attempto Controlled English (ACE), and facilitates a precise bidirectional automatic translation between ACE and language fragments of a number of other natural languages, making the wiki content accessible multilingually. Additionally, our approach allows for automatic translation into the Web Ontology Language (OWL), which enables automatic reasoning over the wiki content. The developed wiki environment thus allows users to build, query and view OWL knowledge bases via a user-friendly multilingual natural language interface. As a further feature, the underlying multilingual grammar is integrated into the wiki and can be collaboratively edited to extend the vocabulary of the wiki or even customize its sentence structures. This work demonstrates the combination of the existing technologies of Attempto Controlled English and Grammatical Framework, and is implemented as an extension of the existing semantic wiki engine AceWiki.
Multilingual Online Generation from Semantic Web
2012
Cited by 4 (1 self)
by the European Union Seventh Framework Programme under grant agreement FP7-ICT-247914. More specifically, we present work from workpackage 8 (WP8): Case Study: Cultural Heritage. The objective of the work is to build an ontology-based multilingual application for museum information on the Web. Our approach relies on the innovative idea of Reason-able View of the Web of linked data applied to the domain of cultural heritage. We have been developing a Web application that uses Semantic Web ontologies for generating coherent multilingual natural language descriptions about museum objects. We have been experimenting with museum data to test our approach and find that it performs well for the examined languages.
Typeful Ontologies with Direct Multilingual
Cited by 4 (2 self)
There is an exciting development in ontology description languages. However, the main focus is on the knowledge representation aspect and not so much on two other aspects that are just as important in practice. First, most languages are based on some kind of untyped logic, which allows asserting axioms that are not well-formed. In contrast, even the simplest database systems are equipped with database schemas that rule out incorrect records. In the long run this helps keep the information consistent. Another aspect of interest in many ontology-based systems is the verbalization of facts and axioms in some controlled language. Although this is not new, it is usually seen as a completely separate component. From an engineering perspective, it is advantageous to use the same language for both ontology description and controlled language development. In our experiment we also realized that in many natural languages the type information, i.e. the ontological classes, affects the language generation.
Tools for multilingual grammar-based translation on the web
In Proceedings of the Association for Computational Linguistics System Demonstrations, 2010
Cited by 3 (0 self)
This is a system demo for a set of tools for translating texts between multiple languages in real time with high quality. The translation works on restricted languages, and is based on semantic interlinguas. The underlying model is GF (Grammatical Framework), which is an open-source toolkit for multilingual grammar implementations. The demo will cover up to 20 parallel languages. Two related sets of tools are presented: grammarian’s tools helping to build translators for new domains and languages, and translator’s tools helping to translate documents. The grammarian’s tools are designed to make it easy to port the technique to new applications. The translator’s tools are essential in the restricted language context, enabling the author to remain in the fragments recognized by the system. The tools that are demonstrated will be applied and developed further in the European project MOLTO (Multilingual On-Line Translation), which started in March 2010 and runs for three years.
1 Translation Needs for the Web
The best-known translation tools on the web are Google Translate (www.google.com/translate) and Systran. They are targeted to consumers of web documents: users who want to find out what a given document is about. For this purpose, browsing quality is sufficient, since the user has intelligence and good will, and understands that she uses the translation at her own risk. Since Google and Systran translations can be grammatically and semantically flawed, they don’t reach publication quality, and hence cannot be used by the producers of web documents. For instance, the provider of an e-commerce site cannot take the risk that the product descriptions or selling conditions have errors that change the original intentions. There are very few automatic translation systems actually in use for producers of information. As already
Patent translation within the MOLTO project
In Workshop on Patent Translation, MT Summit XIII, 2011
Cited by 2 (1 self)
MOLTO is an FP7 European project whose goal is to translate texts between multiple languages in real time with high quality. Patent translation is a case study in which research focuses on obtaining large coverage without losing translation quality. This is achieved by hybridising a grammar-based multilingual translation system, GF, with a specialised statistical machine translation system. Moreover, each individual system by itself already represents a step forward in the translation of patents in the biomedical domain, for which the systems have been trained.
A Framework for Conflict Analysis of Normative Texts Written in Controlled Natural Language
Journal of Logic and Algebraic Programming, 2013
Cited by 2 (1 self)
In this paper we are concerned with the analysis of normative conflicts, or the detection of conflicting obligations, permissions and prohibitions in normative texts written in a Controlled Natural Language (CNL). For this we present AnaCon, a proof-of-concept system where normative texts written in CNL are automatically translated into the formal language CL using the Grammatical Framework (GF). Such CL expressions are then analysed for normative conflicts by the CLAN tool, which gives counter-examples in cases where conflicts are found. The framework also uses GF to give a CNL version of the counter-example, helping the user to identify the conflicts in the original text. We detail the application of AnaCon to two case studies and discuss the effectiveness of
Extracting a bilingual semantic grammar from FrameNet-annotated corpora
Cited by 2 (0 self)
We present the creation of an English-Swedish FrameNet-based grammar in Grammatical Framework. The aim of this research is to make existing framenets computationally accessible for multilingual natural language applications via a common semantic grammar API, and to facilitate the porting of such grammars to other languages. In this paper, we describe the abstract syntax of the semantic grammar while focusing on its automatic extraction possibilities. We have extracted a shared abstract syntax from ~58,500 annotated sentences in Berkeley FrameNet (BFN) and ~3,500 annotated sentences in Swedish FrameNet (SweFN). The abstract syntax defines 769 frame-specific valence patterns that cover 77.8% of the examples in BFN and 74.9% in SweFN belonging to the shared set of 471 frames. As a side result, we provide a unified method for comparing semantic and syntactic valence patterns across framenets.
Computational evidence that Hindi and Urdu share a grammar but not the lexicon
In The 3rd Workshop on South and Southeast Asian NLP, COLING, 2012
Cited by 1 (1 self)
Hindi and Urdu share a grammar and a basic vocabulary, but are often mutually unintelligible because they use different words in higher registers and sometimes even in quite ordinary situations. We report computational translation evidence of this unusual relationship (it differs from the usual pattern, in which related languages share the advanced vocabulary and differ in the basics). We took a GF resource grammar for Urdu and adapted it mechanically for Hindi, changing essentially only the script (Urdu is written in Perso-Arabic, and Hindi in Devanagari) and the lexicon where needed. In evaluation, the Urdu grammar and its Hindi twin either both correctly translated an English sentence, or failed in exactly the same grammatical way, thus confirming computationally that Hindi and Urdu share a grammar. But the evaluation also found that the Hindi and Urdu lexicons differed in 18% of the basic words, in 31% of tourist phrases, and in 92% of school mathematics terms.
Practical parsing of parallel multiple context-free grammars
In Proceedings of TAG+11, the 11th International Workshop on Tree Adjoining Grammar and Related Formalisms
Cited by 1 (0 self)
We discuss four previously published parsing algorithms for parallel multiple context-free grammar (PMCFG), and argue that they are similar to each other, and implement an Earley-style top-down algorithm. Starting from one of these algorithms, we derive three modifications: one bottom-up and two variants using a left corner filter. An evaluation shows that substantial improvements can be made by using the algorithm that performs best on a given grammar. The algorithms are implemented in Python and released under an open-source licence. We start by introducing the necessary concepts. Then we discuss four previously published PMCFG algorithms, and argue that they are similar. We take Angelov (2009) as a starting point for introducing three new parsing strategies. Finally we discuss various optimizations of the parsing strategies and give a small evaluation.