A spelling corrector for Basque based on morphology, 1997
Cited by 12 (6 self)
This paper describes the components used in the elaboration of the commercial Xuxen spelling checker/corrector...
Designing Spelling Correctors for Inflected Languages Using Lexical Transducers, 1999
Cited by 3 (2 self)
This paper describes the components used in the design of the commercial XuxenII spelling checker/corrector for Basque. It is a new version of the Xuxen spelling corrector (Aduriz et al., 97) which uses lexical transducers to improve the process. A very important new feature is the use of user dictionaries whose entries can recognise both the original and inflected forms. In languages with a high level of inflection such as Basque, spelling checking cannot be resolved without adequate treatment of words from a morphological standpoint. In addition to this, the morphological treatment has other important features: coverage, reusability of tools, orthogonality and security. The tool is based on lexical transducers and is built using the fst library of Inxight. A lexical transducer (Karttunen, 94) is a finite-state automaton that maps inflected surface forms to lexical forms, and can be seen as an evolution of two-level morphology (Koskenniemi, 83) where the use of diacritics and homographs can be avoided and the intersection and composition of transducers is possible. In addition, the process is very fast and the transducer for the whole morphological description can be compacted in less than 1 Mbyte. The design of the spelling corrector consists of four main modules: the standard checker, the recogniser using user-lexicons, the corrector of linguistic variants (proposals for dialectal uses and competence errors), and the corrector of typographical errors. An important feature is its homogeneity: the different steps are based on lexical transducers, far from ad-hoc solutions.
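The four-module design described in this abstract can be illustrated with a minimal sketch. It is not the XuxenII implementation: plain Python dictionaries stand in for the lexical transducers, the Inxight fst library is not used, and the lexicon entries, the suffix list, the variant table and every function name below are assumptions made up for illustration.

```python
from typing import Dict, List, Set, Tuple

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

# Toy stand-in for a lexical transducer: surface form -> lexical form.
# Illustrative entries only; not taken from the Xuxen lexicon.
STANDARD: Dict[str, str] = {
    "etxea": "etxe+a",
    "etxeak": "etxe+ak",
    "etxean": "etxe+an",
}

# Hypothetical suffix set used to inflect user-dictionary entries.
SUFFIXES: List[str] = ["a", "ak", "an"]

# Hypothetical table mapping a dialectal variant to a standard form.
VARIANTS: Dict[str, str] = {"etxia": "etxea"}


def user_lexicon(lemmas: List[str]) -> Dict[str, str]:
    """Expand user lemmas so the original and inflected forms are recognised."""
    lexicon: Dict[str, str] = {}
    for lemma in lemmas:
        lexicon[lemma] = lemma
        for suffix in SUFFIXES:
            lexicon[lemma + suffix] = f"{lemma}+{suffix}"
    return lexicon


def edits1(word: str) -> Set[str]:
    """All strings one deletion, transposition, substitution or insertion away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {l + r[1:] for l, r in splits if r}
    transposes = {l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1}
    replaces = {l + c + r[1:] for l, r in splits if r for c in ALPHABET}
    inserts = {l + c + r for l, r in splits for c in ALPHABET}
    return deletes | transposes | replaces | inserts


def check(word: str, user_lex: Dict[str, str]) -> Tuple[str, List[str]]:
    """Run the four modules in order and return (status, proposals)."""
    if word in STANDARD:                      # 1. standard checker
        return ("ok", [])
    if word in user_lex:                      # 2. recogniser using user lexicons
        return ("ok-user", [])
    if word in VARIANTS:                      # 3. corrector of linguistic variants
        return ("variant", [VARIANTS[word]])
    proposals = sorted(                       # 4. corrector of typographical errors
        c for c in edits1(word) if c in STANDARD or c in user_lex
    )
    return ("typo" if proposals else "unknown", proposals)
```

In this sketch every module consults the same kind of lookup structure, which loosely mirrors the homogeneity the abstract highlights. A real transducer would analyse unseen inflected forms compositionally instead of listing them.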
Using Finite State Technology in Natural Language Processing of Basque
Cited by 1 (1 self)
Abstract. This paper describes the components used in the design and implementation of NLP tools for Basque. These components are based on finite state technology and are devoted to the morphological analysis of Basque, an agglutinative pre-Indo-European language. We think that our design can be interesting for the treatment of other languages. The main components developed are a general and robust morphological analyser/generator and a spelling checker/corrector for Basque named Xuxen. The analyser is a basic tool for current and future work on NLP of Basque, such as the lemmatiser/tagger Euslem, an Intranet search engine or an assistant for verse-making.

