  Intelligent Systems Dept.

by Tomaz Erjavec
http://nl.ijs.si/telri/pub-tools/tihany-html-paper.ps

Abstract:

This paper gives an overview of public domain and freely available language engineering software. The focus is on lingware tools that are available via the World Wide Web for the Unix platform and that are concerned with corpus production. The paper discusses the relation of tools to standards, in particular SGML, and the benefits and disadvantages of using public domain tools. It then gives an overview of a number of generic string processing and corpus conversion tools, of statistically based annotation systems, and of computational linguistic software. Some ongoing initiatives on the production, standardisation and availability of language tools are mentioned, and a number of Web sites related to the discussed topics are listed.

Citations

1015 The C Programming Language – Kernighan, Ritchie
549 Transformation-based error-driven learning and natural language processing: A case study in part of speech tagging – Brill - 1995
402 The SGML Handbook – Goldfarb - 1991
349 A simple rule-based part of speech tagger – Brill - 1992
283 A practical part-of-speech tagger – Cutting, Kupiec, et al. - 1992
257 A program for aligning sentences in bilingual corpora – Gale, Church - 1993
129 The Icon Programming Language – Griswold - 1983
111 PC-KIMMO: A Two-Level Processor for Morphological Analysis [Occasional – Antworth
71 The formalism and implementation of PATR-II – Shieber, Uszkoreit, et al. - 1983
67 Does Baum-Welch re-estimation help taggers – Elworthy - 1994
49 A modular and flexible architecture for an integrated corpus query system – Christ - 1994
34 Inheritance and constraint-based grammar formalisms – Zajac - 1992
18 A comprehensive unification-based grammar formalism – Dorre, Eisele - 1991
1 ALE user's guide version 2.0. Laboratory for computational linguistics technical report – Carpenter, Penn - 1994
1 sed & awk. O'Reilly & Associates – Dougherty - 1991
1 The DATR papers. Cognitive Science Research Paper CSRP-139 – Evans, Gazdar - 1990
1 Statistical Language Learning. Language and Computers 12 – Charniak - 1994
1 Public domain generic tools: an overview – Erjavec - 1996