Using COTS Search Engines and Custom Query Strategies at CLEF. Cross-Language Evaluation Forum CLEF (2004)
Venue: | Working Notes for the CLEF 2004 Workshop |
Citations: | 3 - 1 self |
Citations
2472 | An algorithm for suffix stripping - Porter - 1980 |
2187 | Term-weighting approaches in automatic text retrieval
- Salton, Buckley
- 1988
Citation Context ...rms in a query at about 25%. The second step involves sorting the terms according to their frequencies in the corpora, from least frequent to most frequent. This decision is based on the TF-IDF idea (Salton & Buckley, 1988), which states that a rare, infrequent term is more informative than a common, frequent term. These informative terms allow obtaining precise results. Sorting is useful with the strategy described nex... |
213 | Learning algorithms for keyphrase extraction,
- Turney
- 2000
Citation Context ...: S1. the use of the title in isolation (this is our baseline); S2. the use of the description in isolation; S3. the use of the combination of the title and the description; S4. the use of Extractor (Turney, 2000) keyphrases extracted from all fields; S5. the use of the title plus the best Extractor keyphrases. In all cases we removed the words trouvez, documents, pertinents and informations from the French t... |
131 | Optimization of relevance feedback weights. - Buckley, Salton - 1995 |
108 |
Fuzzy Sets in Information Retrieval and Cluster Analysis.
- Miyamoto
- 1990
Citation Context ...trieval (Buckley and Salton, 1995) and is very effective in CLEF-like settings (Lam-Adesina, 2002). In our experiments, our query expansion strategy relies on a Pseudo-Thesaurus construction approach (Miyamoto, 1990) making use of the fuzzy logic operator of max-min composition (Klir & Yuan, 1995). The approach is to take the N-best search engine results (hereafter N-best corpus), to extend our initial query wit... |
106 | Frequency estimates for statistical word similarity measures.
- Terra, Clarke
- 2003
Citation Context ...rs across all records for a given word, or, if the records themselves do not provide adequate information, generality is determined by the term frequency in a terabyte-sized corpus of unlabeled text (Terra and Clarke, 2003). When a word is not contained in Termium, then its translation is obtained using Babel Fish. More details about the translation procedure using Termium can be found in (Jarmasz and Barrière, 2004). ... |
14 | Exeter at CLEF 2001: Experiments with Machine Translation for Bilingual Retrieval - Jones, Lam-Adesina - 2001 |
4 | Selective compound splitting of Swedish queries for Boolean combinations of truncated terms
- Cöster, Sahlgren, et al.
- 2004
Citation Context ...n at CLEF. Both offer Boolean query syntax rather than weighted queries. We realize that this may be a handicap in CLEF-like competitions. Researchers have found strict binary queries to be limiting (Cöster et al., 2003), and most of the best results from previous years rely on systems where each term in a query can be assigned a weight. UC Berkeley performed very well at CLEF 2003 using such a search engine (Chen, ... |
1 |
Babel Fish Translation. http://babelfish.altavista.com/ [Source checked
- Fish
- 2004
Citation Context ...sely the translation of the target documents. In our experiments we decided to translate the queries using three different methods. As a baseline we use the free Babel Fish translation service (Babel Fish, 2004). We compare this to (1) an automatic translation method which relies on TERMIUM Plus ® (Termium, 2004), an English-French-Spanish terminological knowledge base which contains more than 3 500 000 ter... |
1 |
Cross-Language Retrieval Experiments at
- Chen
- 2002
Citation Context ..., 2003), and most of the best results from previous years rely on systems where each term in a query can be assigned a weight. UC Berkeley performed very well at CLEF 2003 using such a search engine (Chen, 2002). Yet the availability and quality of commercial search engines make them interesting resources which we feel merit proper investigation. The first search engine that we use is Copernic Enterprise Se... |
1 |
A Terminological Resource and a Terabyte-Sized Corpus for Automatic Keyphrase in Context Translation
- Jarmasz, Barrière
- 2004
Citation Context ... text (Terra and Clarke, 2003). When a word is not contained in Termium, then its translation is obtained using Babel Fish. More details about the translation procedure using Termium can be found in (Jarmasz and Barrière, 2004). Given a French word, BagTrans assigns probabilities to individual English words that reflect their likelihood of being the translation of that word and then uses the most probable word in the Engli... |
1 |
Termium ® History. http://www.termium.gc.ca/site/histo_e.html [Source checked
- Leonhardt
- 2004
Citation Context ...ystems. The terms stored in Termium are arranged in records, each record containing all the information in the database pertaining to one concept, and each record dealing with one concept alone (Leonhardt, 2004). Thus the translation task becomes one of word sense disambiguation, where a term must be matched to its most relevant record; this record in turn offers us standardized and alternative transl... |