Harman, D. (1996). Overview of the fifth text retrieval conference (TREC5). In Proceedings of the TREC Conference, Gaithersburg, MD.
....and from the queries used. A detailed description of the system can be found in (Tombros & Sanderson, 1998); here we shall briefly describe the summary generation process. The document collection to be summarised comprised news articles of the Wall Street Journal taken from the TREC collection (Harman, 1996). Each individual document of the collection was passed through the summarisation system, and as a result a score for each sentence of each document was computed. This score represents the sentence's importance for inclusion in the document's summary. Scores are assigned to sentences by examining ....
Harman, D. (1996). Overview of the fifth text retrieval conference (TREC5). In Proceedings of the TREC Conference, Gaithersburg, MD.
....from the queries used. A detailed description of the system can be found in (Tombros & Sanderson, 1998); here we shall briefly describe the summary generation process. The document collection to be summarised comprised news articles of the Wall Street Journal (WSJ) taken from the TREC collection (Harman, 1996). Each individual document of the collection was passed through the summarisation system, and as a result a score for each sentence of each document was computed. This score represents the sentence's importance for inclusion in the document's summary. Scores are assigned to sentences by examining ....
Harman, D. (1996). Overview of the fifth text retrieval conference (TREC-5). In Proceedings of the TREC Conference, Gaithersburg, MD, USA.
....to explore the effect of collection size on the retrieval methods. Table 1 shows the amount of data included in each set. There were 49 queries in the test set, each specifically designed to select a single document from the corpus. This configuration is called a Known Item Retrieval (KIR) task [9]. It is assumed, but not established, that all the remaining documents in the set are irrelevant to the query. Although KIR is a somewhat unrealistic retrieval scenario, it is easy to set up, and was used by NIST for the initial Spoken Document Retrieval (SDR) task at the TREC-6 conference. In ....
....text approximately 85% of time. 1 By reducing the beam used during acoustic search. 4.4 Information Retrieval Runs The information retrieval system was run on each of the three, successively larger, document sets, and for each of three different text conditions, using the LNU relevance measure [9]. Table 2 and Figure 3 show that, by using the LOD metric, the degradation in retrieval performance for the recognized texts could be reduced by approximately 26% for the smallest, 38% for the middle-sized, and by 13% for the largest document set. Number of Documents Reference CSR Top 1 CSR Using ....
E. Voorhees, D. Harman, "Overview of the Fifth Text REtrieval Conference," Proc. of TREC-5, Nov. 1996.
....used in the experimentation. 5.1 The Test Collection Since to the best of my knowledge there is no test collection available with spoken queries, we had to generate it from an existing textual collection 2 . The collection used is the TREC-5 B, a subset of the collection generated for TREC-5 [13]. The collection is made of selected full text articles of the Wall Street Journal (years 1990-92). Some of the characteristics of this test collection are reported in table 1. An example of a document of the WSJ collection is reported in the following. Notice that documents are marked up using ....
D. Harman. Overview of the fifth text retrieval conference (TREC-5). In Proceedings of the TREC Conference, Gaithersburg, MD, USA, November 1996.
....most of it based on the unranked Boolean retrieval model of information retrieval (IR) [van Rijsbergen, 1979]. Most of the recent research on document filtering is based on the assumption that effective IR techniques are also effective document filtering techniques. The TREC conference (see [Harman, 1996] for the last TREC conference) is a good example of this practice. Recently, the term information filtering (IF) has started being used in place of the old style document filtering, to emphasise the possibility of selectively distributing multimedia information. In the context of this paper we ....
....stream of documents then w C is the null vector. 5 Evaluation Framework and Results In the context of the work reported in this paper we intended to evaluate the performance of our IF learning model, in particular when little training data is provided. The collection we used is the TREC-5 B [Harman, 1996], a subset of the collection used in the experiments done in 1996 in the context of the TREC-5 initiative. The collection is made of 3 years (1990-92) of selected full text articles of the Wall Street Journal. The total number of documents (articles) in the collection is about 75,000. Each ....
D. Harman. Overview of the fifth text retrieval conference (TREC-5). In Proceedings of the TREC Conference, Gaithersburg, MD, USA, November 1996.
....assumption that effective IR techniques were also effective IF techniques. Many of the IF approaches proposed at the TREC conferences, for example, were based on past successful IR approaches. This view has been challenged recently by Callan [8] and by the proposer of the TREC-5 Filtering track [14]. The idea is that different techniques are required in order to design effective IF and IR systems. In particular, IF requires more sophisticated techniques of learning through relevance feedback than IR, since it is important to be able to model the user information need with the most efficient use ....
....systems have been proposed. One application area that has been heavily targeted is news filtering [18]. Moreover, much effort has been devoted to IF in the context of the TREC initiative, as the increasing number of participants in the two sessions of routing and filtering proves (see TREC-5 [14], for example). The area of IF brings together many different experiences from other areas, like machine learning, data mining, knowledge representation, and so on. The main contribution of IR, and in particular of TREC, to the IF community is in providing sound evaluation techniques. We believe ....
D. Harman. Overview of the fifth text retrieval conference (TREC-5). In Proceedings of the TREC Conference, Gaithersburg, MD, USA, November 1996.
....information retrieval based on thesaurus based query expansion approach performed over a collection of comparable multilingual documents. 43 The Eurospider retrieval system is based on fully automatic indexing (no manual indexing required). The EuroSpider system has been evaluated at TREC-5 [Harman96]. It provides many functions of the new generation retrieval systems such as relevance ranking, word normalization, relevance feedback, automatic indexing. Eurospider's architecture allows powerful integration of database management systems and advanced retrieval functions. When adding the ....
D. Harman. Overview of the Fifth Text REtrieval Conference (TREC5). NIST, 1996
....of the collection and from the queries used. A detailed description of the system can be found in [24]; here we shall briefly describe the summary generation process. The document collection to be summarised comprised news articles of the Wall Street Journal (WSJ) taken from the TREC collection [11]. Each individual document of the collection was passed through the summarisation system, and as a result a score for each sentence of each document was computed. This score represents the sentence's importance for inclusion in the document's summary. Scores are assigned to sentences by examining ....
D. Harman. Overview of the fifth text retrieval conference (TREC-5). In Proceedings of the TREC Conference, Gaithersburg, MD, USA, November 1996.
....extent majority of documents in the two sets have the same source for documents: WSJ, AP, and ZIFF. The test set does have 5 This is one of the main reasons why the general performance in the TREC-4 and the TREC-5 routing tasks is much lower than the performance for the TREC-3 routing task. [12, 13]
K (rank cut off)             1,000    2,000    4,000    6,000    8,000    10,000
TREC-3 Average Precision     0.4368   0.4415   0.4326   0.4279   0.4235   0.4213
TREC-3 vs. No QZ (0.3962)    10.2     11.4     9.2      8.0      6.9      6.3
TREC-4 Average Precision     0.3223   0.3535   0.3735   0.3777   0.3776   0.3773
TREC-4 vs. No QZ             9.2      0.5      5.2      ....
D. K. Harman. Overview of the fifth Text REtrieval Conference (TREC-5). In Proceedings of the Fifth Text REtrieval Conference (TREC-5), 1997 (to appear).
....that effective IR techniques were also effective IF techniques. Many of the IF approaches proposed at the TREC conferences, for example, were based on past successful IR approaches. This view has been criticised recently by Callan [Callan, 1996] and by the proposer of the TREC-5 Filtering track [Harman, 1996]. The idea is that alternative techniques to IR are required in order to design effective IF systems. In particular, IF requires more sophisticated techniques of learning than IR, since it is important to be able to model the user information need with the most efficient use of the information the ....
....performance by a training set of a balanced number of relevant and non-relevant documents (8 relevant and 8 non-relevant, 16 relevant and 16 non-relevant). For low values of recall ProFile is insensitive with respect to cutting off the lowest frequent terms. The collection we used is the TREC-5 B [Harman, 1996], a subset of the collection used in the experiments done in 1996 in the context of the TREC-5 initiative. The collection is made of 3 years (1990-92) of selected full text articles of the Wall Street Journal. The total number of documents (articles) in the collection is about 75,000. Each ....
Harman, D. (1996). Overview of the fifth text retrieval conference (TREC-5). In Proceedings of the TREC Conference, Gaithersburg, MD, USA.
No context found.
Harman, D.K. (1997b). Overview of the Fifth Text REtrieval Conference (TREC-5). In The Fifth Text REtrieval Conference (TREC-5).