2006b. Multinomial Mixture Modelling for Bilingual Text Classification (0)

by J Civera, A Juan
Venue: In Proc. of PRIS’06

Results 1 - 3 of 3

Bilingual Machine-Aided Indexing

by Jorge Civera, Alfons Juan
The proliferation of multilingual documentation in our Information Society has become a common phenomenon. This documentation is usually categorised by hand, which is time-consuming and arduous. This is particularly true of keyword assignment, in which a list of keywords (descriptors) from a controlled vocabulary (thesaurus) is assigned to a document. So-called Machine-Aided Indexing (MAI) systems offer a possible way to alleviate this problem. These systems work in cooperation with professional indexers by providing an initial list of descriptors from which the most appropriate are selected. This way of proceeding increases productivity and eases the task of indexers. In this paper, we propose a statistical text classification framework for bilingual documentation, from which we derive two novel bilingual classifiers based on the naive combination of monolingual classifiers. We report preliminary results on the multilingual corpus Acquis Communautaire (AC) that demonstrate the suitability of the proposed classifiers as the backend of a fully-working MAI system.
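One simple way to read "naive combination of monolingual classifiers" is to assume the two language versions of a document are conditionally independent given the class, so that the class score becomes log p(c) + log p(x|c) + log p(y|c). The sketch below illustrates that reading only; the abstract does not specify the two classifiers the paper derives, and the class `MultinomialNB`, the function `classify_bilingual`, and the toy English/Spanish data are all hypothetical names and values invented for this example.

```python
import math
from collections import Counter, defaultdict


class MultinomialNB:
    """Tiny multinomial naive Bayes with Laplace (add-one) smoothing."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.vocab = {w for d in docs for w in d}
        self.counts = defaultdict(Counter)   # class -> word counts
        self.n_docs = Counter(labels)        # class -> number of documents
        for doc, label in zip(docs, labels):
            self.counts[label].update(doc)
        return self

    def log_prior(self, c):
        return math.log(self.n_docs[c] / sum(self.n_docs.values()))

    def log_likelihood(self, doc, c):
        total = sum(self.counts[c].values()) + len(self.vocab)
        # Words never seen in training are skipped rather than smoothed.
        return sum(math.log((self.counts[c][w] + 1) / total)
                   for w in doc if w in self.vocab)


def classify_bilingual(model_x, model_y, doc_x, doc_y):
    """Pick the class maximising log p(c) + log p(x|c) + log p(y|c).

    Assumption: the two languages are independent given the class.
    """
    scores = {
        c: model_x.log_prior(c)
           + model_x.log_likelihood(doc_x, c)
           + model_y.log_likelihood(doc_y, c)
        for c in model_x.classes
    }
    return max(scores, key=scores.get)


# Toy English/Spanish training pairs (invented for illustration only).
train_en = [["fish", "quota"], ["fishing", "fleet"],
            ["tax", "rate"], ["income", "tax"]]
train_es = [["cuota", "pesca"], ["flota", "pesca"],
            ["tipo", "impuesto"], ["impuesto", "renta"]]
labels = ["fisheries", "fisheries", "taxation", "taxation"]

model_en = MultinomialNB().fit(train_en, labels)
model_es = MultinomialNB().fit(train_es, labels)
prediction = classify_bilingual(model_en, model_es,
                                ["fish", "fleet"], ["pesca", "flota"])
print(prediction)
```

In an MAI setting, the per-class scores could be sorted to produce a ranked list of candidate descriptors instead of a single label.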

Citation Context

...re is little difference between the performance of the monolingual and bilingual classifiers. However, this is the exception rather than the rule, as revealed in previous work (Juan and Civera, 2005; Civera and Juan, 2005). 4.3. Discussion The excellence of these results should be assessed bearing in mind the complexity of this task and how MAI systems work. On the one hand, professional indexers do not com...

Bilingual Text Classification using the IBM 1 Translation Model

by Jorge Civera, Alfons Juan-Císcar
Manual categorisation of documents is a time-consuming task that has been significantly alleviated with the deployment of automatic and machine-aided text categorisation systems. However, the proliferation of multilingual documentation has become a common phenomenon in many international organisations, while most current systems have focused on the categorisation of monolingual text. It has recently been shown that the inherent redundancy in bilingual documents can be effectively exploited by relatively simple, bilingual naive Bayes (multinomial) models. In this work, we present a refined version of these models in which this redundancy is explicitly captured by a combination of a unigram (multinomial) model and the well-known IBM 1 translation model. The proposed model is evaluated on two bilingual classification tasks and compared to previous work.
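As a rough illustration of how a unigram model and IBM Model 1 might be combined for classification, the sketch below assumes the class-conditional factorisation p(x, y | c) = p(x | c) p(y | x, c), with a Laplace-smoothed unigram model for the source text x and a per-class IBM Model 1 for the target text y. This factorisation, the probability floor for unseen word pairs, the function names (`train_ibm1`, `log_ibm1`, `train`, `classify`), and the toy data are assumptions made for the example; the abstract does not give the paper's exact formulation.

```python
import math
from collections import Counter, defaultdict

NULL = "<null>"  # empty source word used by IBM Model 1


def train_ibm1(pairs, iterations=10):
    """Estimate t(target | source) with EM on (source, target) pairs."""
    target_vocab = {w for _, tgt in pairs for w in tgt}
    t = defaultdict(lambda: 1.0 / len(target_vocab))  # uniform start
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        for src, tgt in pairs:
            src = [NULL] + src
            for y in tgt:
                norm = sum(t[(y, x)] for x in src)
                for x in src:                      # E-step: expected counts
                    frac = t[(y, x)] / norm
                    count[(y, x)] += frac
                    total[x] += frac
        for (y, x), c in count.items():            # M-step: renormalise
            t[(y, x)] = c / total[x]
    return dict(t)


def log_ibm1(src, tgt, t, floor=1e-9):
    """log p(tgt | src) under IBM Model 1, ignoring the length term.

    Word pairs never seen together get the (assumed) floor probability.
    """
    src = [NULL] + src
    return sum(
        math.log(sum(t.get((y, x), floor) for x in src) / len(src))
        for y in tgt
    )


def train(pairs, labels):
    """Fit one unigram model and one translation table per class."""
    vocab_size = len({w for src, _ in pairs for w in src})
    models = {}
    for c in set(labels):
        class_pairs = [p for p, l in zip(pairs, labels) if l == c]
        unigram = Counter(w for src, _ in class_pairs for w in src)
        models[c] = {
            "prior": len(class_pairs) / len(pairs),
            "unigram": unigram,
            "total": sum(unigram.values()) + vocab_size,  # add-one smoothing
            "t": train_ibm1(class_pairs),
        }
    return models


def classify(models, src, tgt):
    """Return argmax_c of log p(c) + log p(x|c) + log p(y|x,c)."""
    def score(m):
        return (math.log(m["prior"])
                + sum(math.log((m["unigram"][w] + 1) / m["total"]) for w in src)
                + log_ibm1(src, tgt, m["t"]))
    return max(models, key=lambda c: score(models[c]))


# Toy English/Spanish document pairs (invented for illustration only).
pairs = [(["fish", "quota"], ["cuota", "pesca"]),
         (["fishing", "fleet"], ["flota", "pesca"]),
         (["tax", "rate"], ["tipo", "impuesto"]),
         (["income", "tax"], ["impuesto", "renta"])]
labels = ["fisheries", "fisheries", "taxation", "taxation"]

models = train(pairs, labels)
print(classify(models, ["fish", "fleet"], ["pesca", "flota"]))
```

Unlike a combination that treats the two languages as independent given the class, the translation term here scores each target word against the actual source words of the same document, which is one way the redundancy between the two versions can be modelled explicitly.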

Contents

by Juan
Abstract not found