D. Harman. Overview of the fourth Text Retrieval Conference (TREC-4). In Proceedings of the Fourth Text Retrieval Conference, 1996.


This paper is cited in the following contexts:


The Philosophy of Information Retrieval Evaluation - Voorhees   (2 citations)  (Correct)

....not have been judged and would have been assumed to be not relevant. Presumably, there are other documents that didn't make it into the pools that also would have been judged relevant. Indeed, a test of the TREC 2 and TREC 3 collections demonstrated the presence of unjudged relevant documents [8]. In this test, relevance assessors judged the documents in new pools formed from the second 100 documents in the ranked results submitted by participants. On average, the assessors found approximately one new relevant document per run (i.e. one relevant document that was not in the pool created ....

Donna Harman. Overview of the fourth Text REtrieval Conference (TREC-4). In D. K. Harman, editor, Proceedings of the Fourth Text REtrieval Conference (TREC-4), pages 1-23, October 1996. NIST Special Publication 500-236.
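
The pooling procedure described in the context above (judge only the documents that appear near the top of the submitted runs, then re-pool from the next 100 ranks) can be sketched roughly as follows. This is a minimal illustration, not the NIST procedure itself: the function name `build_pool`, the run names, and the toy document IDs are all assumptions made for the example, and the depth is shrunk from 100 to 2 so the result is easy to check by hand.

```python
from typing import Dict, List, Set


def build_pool(runs: Dict[str, List[str]], start: int = 0, depth: int = 100) -> Set[str]:
    """Return the union of the documents ranked in positions
    [start, start + depth) across all submitted runs (hypothetical helper)."""
    pool: Set[str] = set()
    for ranked_docs in runs.values():
        pool.update(ranked_docs[start:start + depth])
    return pool


# Toy ranked result lists, one per participating run (illustrative data only).
runs = {
    "run_a": ["d1", "d2", "d3", "d4"],
    "run_b": ["d2", "d5", "d1", "d6"],
}

# Original pool: the top `depth` documents of every run.
first_pool = build_pool(runs, start=0, depth=2)

# Follow-up pool: the next `depth` documents of every run, minus anything
# already judged. Documents here were previously assumed not relevant.
second_pool = build_pool(runs, start=2, depth=2) - first_pool
```

Any document in `second_pool` that an assessor then judges relevant is an "unjudged relevant document" in the sense of the quoted passage.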


Improving Automatic Query Expansion - Mandar Mitra Amir   (47 citations)  (Correct)

....documents assumed relevant is actually non relevant, then the words added to the query (drawn mostly from these documents) are likely to be unrelated to the topic and the quality of the documents retrieved using the expanded query is likely to be poor. Consider query 203 from the TREC collection [9], for example. What is the economic impact of recycling tires? For this query, out of the first 20 documents retrieved by our system, only 4 are relevant. Most of the remaining documents discuss recycling of plastics, cans, glass, etc. without referring to tires. When these 20 This study was ....

....investigated more extensively in the next section. 4 Experiments and Results In order to determine the usefulness of the techniques described above, we test them on a variety of tasks. We use the TREC collections in our experiments. Our methods are evaluated on the adhoc tasks for TRECs 3-6 [8, 9, 20, 21]. The query sets and document collections used in these tasks are shown in Table 1. Since we are interested in studying short queries, we use only the Description field for queries 151-300. For the TREC 6 queries (numbered 301-350) we use the Title field in addition a. Our experiments use ....

D. K. Harman. Overview of the Fourth Text REtrieval Conference (TREC-4). In D. K. Harman, editor, Proceedings of the Fourth Text REtrieval Conference (TREC-4). NIST Special Publication 500-236, October 1996.


Fusion of Information Retrieval Engines (FIRE) - Mounir, Goharian, Mahoney.. (1998)   (1 citation)  (Correct)

....each with different indexing. The work was predominantly theoretical, and no experimental evaluation of the derived model and the selection criterion were presented. Finally, in 1995, an entire track within the TREC activities was devoted to the concept of merging separate databases [Harm95]. For a recent overview of the NIST TREC activities, see [Harm98] Smeaton Crimmins [Smea98] created a Java based user interface for multiple search engines. Individual search engine results are merged and displayed. No accuracy measurements were presented. For a general overview of information ....

Donna Harman, "Overview of the Fourth Text REtrieval Conference (TREC-4)", TREC-4 Proceedings, 1995.


How Reliable are the Results of Large-Scale Information Retrieval.. - Zobel (1998)   (15 citations)  (Correct)

....remain unidentified only 5040 by the main runs and 5524 by all contributors for TREC 5, well short of our predictions. The assumption that unjudged documents are irrelevant is not well founded. These results are confirmed by observed numbers of new relevant documents. As reported by Harman [2], subsequent to the main TREC 3 event pools were formed from depths 101-200 and assessed in a search for further relevant documents. This experiment found around 35 more relevant documents per query, or around 9000 relevant documents in total. Our results show that, in addition to the documents found to ....

D. Harman. Overview of the fourth text retrieval conference (TREC-4). In D. Harman, editor, Proc. Text Retrieval Conference (TREC), October 1995.


Collaborative Filtering - Griffith, O'Riordan (2000)   (Correct)

....and the user request is a profile representing a long term interest. For an overview of the differences and similarities between retrieval and retrieval systems the user is directed to [4] Newer systems and theories are being developed every year with results presented in conferences such as TREC[9]. According to Malone[13] there are three classes of filtering techniques: cognitive, based on the content of articles (which has received the most attention in the past) social (or collaborative) based on human judgments (which is the focus of this paper) and economic, based on the cost of ....

D. Harman. Overview of the fourth text retrieval conference (trec 4). TREC-4 Proceedings, 1995.


The Effectiveness of a Dictionary-Based Technique for.. - Adriani, Croft (1997)   (2 citations)  (Correct)

....terms are then added to the query resulting in an expanded query. The query expansion can be performed before, after, or both before and after the query translation. In this study we compare all of the three schemes. We construct the Indonesian queries based on TREC's Spanish query topics SP26-45 [Harman, 1995] by translating the queries to Indonesian and modifying them to make them relevant with Indonesia's national affairs. The English queries are chosen from TREC topics 151-171. To obtain the terms for the pre-translation query expansion, the local feedback process is performed on the Associated ....

Harman, Donna. Overview of the Fourth Text Retrieval Conference (TREC-4). Proceedings of the Fourth Text Retrieval Conference (TREC-4), 1995.


Retrieval Effectiveness Of Various Indexing Techniques On.. - Adriani, Croft (1997)   (Correct)

....newspaper publisher's office. A manual automatic combined index was obtained by first adding the manual index terms to the text documents, and then built the index automatically using INQUERY. The natural language queries for our experiments were created based on the TREC short queries topics [Harman, 1995]. Query topics which are relevant to the Indonesian text documents were selected. We used 25 queries for all of the experiments. The phrase and Boolean queries were constructed manually transforming from the natural language queries. The effectiveness of the indexing techniques was measured by ....

Harman, Donna. Overview of the Fourth Text Retrieval Conference (TREC-4). Proceedings of the Fourth Text Retrieval Conference (TREC-4), 1995.


Predicting the Effectiveness of Naive Data Fusion on the Basis of.. - Ng (2000)   (Correct)

....Typically this range is taken equal to the difference between the score of the highest ranked document and the score of the document at some fiducial position, such as the 100th or 1000th document in the list. Since our experiments were conducted using the TREC data on the routing task (Harman 1996), for which systems were encouraged to produce a list of 1000 ranked documents, the range that we use is determined by the first and 1000th retrieved documents. Previous work on the predictive problem has concentrated on attempting to predict the performance of the fused system (Vogt and ....

....documents retrieved through that point in the list, represented by g(n). The recall at position n is defined as g(n)/G, where G is the total number of documents in the collection that are relevant to the present quest. The precision at position n is defined as p_n = g(n)/n. In the TREC setting (Harman 1996), p_100 is one of several published indicators of the effectiveness of the scheme. It is the most finely resolved indicator that does not depend on unproven assumptions about the overall nature of the problem. The other candidates, such as p_1000, involve the presumption that all documents not ....

[Article contains additional citation context not shown here]

Harman, D. (1996). Overview of the fourth Text REtrieval Conference. In D. Harman (ed.) Proceedings of the Fourth Text Retrieval Conference. Washington. DC: GPO.
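
The definitions quoted in this entry (g(n) as the number of relevant documents retrieved through position n, recall g(n)/G, and precision p_n = g(n)/n) can be illustrated with a short sketch. The function names, the toy ranking, and the relevance set below are assumptions made for the example and do not come from the cited paper.

```python
from typing import List, Set


def relevant_through(ranking: List[str], relevant: Set[str], n: int) -> int:
    """g(n): number of relevant documents among the first n ranked documents."""
    return sum(1 for doc in ranking[:n] if doc in relevant)


def precision_at(ranking: List[str], relevant: Set[str], n: int) -> float:
    """p_n = g(n) / n. Divides by n even if the ranking is shorter than n."""
    if n <= 0:
        raise ValueError("n must be positive")
    return relevant_through(ranking, relevant, n) / n


def recall_at(ranking: List[str], relevant: Set[str], n: int) -> float:
    """g(n) / G, where G is the total number of relevant documents."""
    if not relevant:
        raise ValueError("recall is undefined when no relevant documents exist")
    return relevant_through(ranking, relevant, n) / len(relevant)


# Illustrative data: a four-document ranking and three known relevant documents.
ranking = ["d3", "d7", "d1", "d9"]
relevant = {"d1", "d3", "d5"}

p_2 = precision_at(ranking, relevant, 2)  # one relevant document in the top 2
r_4 = recall_at(ranking, relevant, 4)     # two of the three relevant documents found
```

Note that recall depends on G, which is only known exactly if every relevant document has been judged; precision at a shallow cutoff such as 100 avoids that assumption as long as the top-ranked documents were all judged.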


Automated Query Generation For Embedded Information Retrieval - Kulyukin   (Correct)

....i.e. the number of nonrelevant documents before the # th relevant one. For example, if # ##, the numbers of nonrelevant documents before the 1st relevant one in all retrieved sets are added and divided by the number of submitted queries. The two standard evaluation metrics, precision and recall [Harman 1996; Salton and McGill 1983] were not used. Recall is the ratio of retrieved relevant documents to all relevant documents; precision is the ratio of retrieved relevant documents to all retrieved documents. Recall was not used, because its assumption that all relevant documents are known for every ....

Harman, D. 1996. Overview of the Fourth Text REtrieval Conference. Proceedings of the Fourth Text REtrieval Conference (TREC-4), ACM Press, 1996.


Optimal High Performance Parallel Text Retrieval via.. - Mamalis, Spirakis, Tampakas (1997)   (2 citations)  (Correct)

....collection data in the recent Information Retrieval research. They have been widely used during the last five years and they are the official data used for the purposes of the well known annual TREC conferences (sponsored by the National Institute of Standards and Technology of the U.S. A, see [24]) Our experimental efforts have been based on the efficient simulation of an ideal fat tree over the 2D mesh structured network of Parsytec parallel machine, via the specific embedding method presented in the Appendix. This embedding offers a reliable simulation of the high capacity channels of ....

D. Harman, Overview of The Fourth Text Retrieval Conference, in the proceedings of the fourth (4th) Text Retrieval Conference, TREC 95, November 25-27, Gaithersburg, Md, USA, pp. 1-20, National Institute of Standards and Technology, Special Publication, 1995. [Electronic proceedings at http://www-nlpir.nist.gov/TREC].


A Hidden Markov Model Information Retrieval System - Miller, Leek, Schwartz (1999)   (40 citations)  (Correct)

....this ranking with that given by the well known tf.idf measure. In particular, we used the tf.idf measure presented in [16] and reproduced in Figure 3. For the HMM transition probabilities, we used the EM algorithm to train the value of a_1 = 0.3 using training examples from the TREC 4 collection [7]. Table 1 shows the non interpolated average precision (AveP) achieved by each ranking measure for a variety of test conditions. 2 In all cases, the HMM system dramatically outperforms tf.idf, exceeding it by as much as 8 percentage points in absolute terms. Others [24] have reported somewhat ....

D. Harman, "Overview of the Fourth Text REtrieval Conference." In D. K. Harman, editor, Proceedings of the Sixth Text Retrieval Conference (TREC-6), NIST Special Publication 500-236, pp. 1-24 (1996).


Comparing the Performance of Database Selection.. - French, Powell, Callan, .. (1999)   (28 citations)  (Correct)

....test environments are idiosyncratic in both data and evaluation measures, making it impossible to compare results. In French et al. [9] we proposed a test environment for the systematic study of distributed information retrieval algorithms. Our testbed is based on the TIPSTER data used in the TREC [16] conferences. We decompose the large collections into smaller subcollections that serve as hypothetical sites in our distributed information retrieval test environment. The data is decomposed by source, year, and month resulting in 236 sites. We used TREC topics 51-150 as the test queries in our ....

D. Harman. Overview of the Fourth Text Retrieval Conference (TREC-4). In Proceedings of the Fourth Text Retrieval Conference (TREC-4), 1996.


The Impact of Database Selection on Distributed Searching - Powell, French, Callan (2000)   (15 citations)  (Correct)

....environment in which database selection was not used, while Xu and Croft used CWI for all experiments reported in [23] 4.1 Testbeds We used three different document testbeds in our experiments. All three testbeds are based upon 3 gigabytes of data available to participants in the TREC 4 [14] experiments 1 . The data is spread over several years and comes from seven (7) primary sources: AP Newswire (AP) Wall Street Journal (WSJ) Computer Select (ZIFF) the Patent Office (PAT) San Jose Mercury News (SJMN) Federal Register (FR) and Department of Energy (DOE) The three testbeds ....

D. Harman. Overview of the Fourth Text Retrieval Conference (TREC-4). In Proceedings of the Fourth Text Retrieval Conference (TREC-4), 1996.


High Performance Parallel Text Retrieval over Large.. - Mamalis, Spirakis.. (1998)   (2 citations)  (Correct)

.... and efficient (in terms of fast, interactive response times) user access; the academic information retrieval research has also been clearly oriented to the development, usage and integrated implementations and testings over large scale standard document collections (i.e. see TREC efforts [16]) Therefore, with respect to the above needs, parallel processing is naturally called to offer a reliable solution towards the direction of enhancing at least one of the above mentioned user demands: fast and highly interactive access over very large amounts of textual data. Considering the most ....

....concerning the high performance of PFIRE system various efficiency experiments have been performed over the GCel3 512 Parsytec machine with use of the virtual tree topologies described in section 5.2. The collection data used were taken from the well known large scale TIPSTER TREC collections [16]. Specifically, we've used the WSJ [ 90 92] part, out of the whole set of TREC collections, which consists of approximately 75000 documents 250MB of text data. Moreover, in order to obtain collections of varying size, we've produced appropriate fractions and simulated multiples of the WSJ ....

D. Harman, "Overview of the Fourth Text REtrieval Conference", Fourth Text REtrieval Conference, TREC'95, November 25--27, Gaithersburg, Md. USA, NIST Special Publication, pp. 1--20, electronic proceedings available at "http://www-nlpir.nist.gov/TREC", 1995.


Comparing the Performance of Database Selection.. - French, Powell, Callan, .. (1999)   (28 citations)  (Correct)

....test environments are idiosyncratic in both data and evaluation measures, making it impossible to compare results. In French et al. [9] we proposed a test environment for the systematic study of distributed information retrieval algorithms. Our testbed is based on the TIPSTER data used in the TREC [16] conferences. We decompose the large collections into smaller subcollections that serve as hypothetical sites in our distributed information retrieval test environment. The data is decomposed by source, year, and month resulting in 236 sites. We used TREC topics 51-150 as the test queries in our ....

D. Harman. Overview of the Fourth Text Retrieval Conference (TREC-4). In Proc. TREC-4, 1996.


The TREC-5 Filtering Track - Lewis (1997)   (20 citations)  (Correct)

....For that reason, total utility for filtering runs was estimated using samples of the submitted documents. Two different sampling and estimation methods were used in the TREC 5 filtering track, as described below. 5. 1 Pooling The first approach to sampling was the usual TREC pooling strategy [4]. This approach assumes that some known pool of documents contains all the relevant documents in the test set. The pool for the TREC 5 filtering task consisted of all documents judged for the topic in the main routing task, plus all documents judged for the topic for the filtering task, as chosen ....

Donna Harman. Overview of the fourth Text REtrieval Conference (TREC-4). In D. K. Harman, editor, The Fourth Text REtrieval Conference (TREC-4), Gaithersburg, MD, 1996. U. S. Dept. of Commerce, National Institute of Standards and Technology.


Dialogue Management in Vector-Based Call Routing - Chu-Carroll, Carpenter (1998)   (Correct)

....it is trained fully automatically to both route and disambiguate requests, and 3) its performance is sufficient for use in the field, substantially improving on that of previous systems. 2 Related Work Call routing is similar to topic identification (McDonough et al. 1994) and document routing (Harman, 1995) in identifying which one of n topics (destinations) most closely matches a caller's request. Call routing is distinguished from these activities by requiring a single destination, but allowing a request to be refined in an interactive dialogue. We are further interested in carrying out the ....

D. Harman. 1995. Overview of the fourth Text REtrieval Conference. In Proc. TREC.


Learning Routing Queries in a Query Zone - Singhal (1997)   (21 citations)  (Correct)

....used. We approximate a query domain by a query zone. Experiments show that routing profiles learned from a query zone are 8 12 more effective than the profiles generated when no query zoning is used. 1 Background Document routing is an important problem in the field of information retrieval. [12] When a user has marked several articles as relevant to his/her information need, a system should be able to automatically learn the user's profile and should be able to route (send) new, potentially interesting, articles to the user. This problem has also been called as selective dissemination ....

....run these queries on the training corpus retrospectively and select the query that has the best average precision performance on the training corpus. Parameters involved: alpha, beta, gamma, and a list of rank cutoffs. 4 This situation arises in TREC because pooling is used for relevance judgments. [11, 12] We assume that all the unjudged documents are non relevant. 5 Related Work In [15] Hull, and in [28] Schutze, Hull, and Pedersen want to reduce the dimensionality of the feature space dramatically for use with strong learning methods. They use singular valued decomposition (SVD) of the document ....

[Article contains additional citation context not shown here]

D. K. Harman. Overview of the fourth Text REtrieval Conference (TREC-4). In Proceedings of the Fourth Text REtrieval Conference (TREC-4), pages 1-- 24. NIST Special Publication 500-236, October 1996.


Boolean System Revisited: Its Performance and its Behavior - Allan Lu (1996)   (1 citation)  (Correct)

....the Boolean searcher would be reluctant to migrate to the natural language search system if the Boolean system tends to find different relevant documents. Thus it is desirable to look beyond the traditional effectiveness measures when comparing the Boolean system and the ranking system. TREC4 [9] presents an excellent opportunity to conduct a thorough comparison of the Boolean system and the relevance ranking system. First of all, the test data is sufficiently large and diverse, putting adequate stress on the involved systems. Particularly neither the Boolean search subject nor the ....

....99 211 150 236 16 212 63 237 181 213 31 238 65 214 10 239 40 215 18 240 288 216 32 241 17 217 19 242 61 218 5 243 21 219 21 244 77 220 18 245 23 221 247 246 6 222 56 247 53 223 148 248 28 224 47 249 20 225 127 250 164 TABLE 1. Boolean Answer Set Sizes 6 average macro precision and macro recall [9] are 0.38 and 0.22, respectively. The composed E measure with alpha=0.5 is 0.76. The relevance ranking systems in this study are the four best performers in the automatic and manual ad hoc tests [6] Since both the Boolean system and the manual ad hoc systems had human involvement, their results ....

Harman, D. K., editor. (1996) Overview of the Fourth Text REtrieval Conference (TREC-4), to be published by NIST.


Pharos: A Scalable Distributed Architecture for Locating.. - Dolin (1996)   (9 citations)  (Correct)

....retrieval (Salton, 1989) for example SMART (Salton and McGill, 1983) Latent Semantic Indexing (LSI) Berry and Dumais, 1995) and others This research was partially supported by NSF NASA DARPA under grant number IRI94 11330. Dolin, 2 involved with the NIST Text REtrieval Conferences (TREC) (Harman, 1996). Our work, though, focuses on the selection of collections of documents rather than on particular documents themselves. Thus, we are not directly addressing the problem of a user finding a particular document via author, title, keyword, etc. However, a particular author or title is generally ....

....the degree of success of finding the best sources. Such an estimate requires the quantification of a few parameters. When dealing with a standard, single text document database, precision usually gives a measure of how many of the returned documents are considered to be relevant to the query (Harman, 1996). This definition does not extend naturally to the problem of locating sources. A source could be considered relevant if it contains even a single relevant document, resulting in a relevance test that is too broad and unintuitive. Instead, we are interested in a measure that is higher for sources ....

[Article contains additional citation context not shown here]

Harman, D. (1996). Overview of the Fourth Text REtrieval Conference (TREC-4). Gaithersburg, MD.


Information Retrieval by Means of Word Sense Disambiguation - Alfonso Ure   (Correct)

No context found.

D. Harman. Overview of the fourth Text Retrieval Conference (TREC-4). In Proceedings of the Fourth Text Retrieval Conference, 1996.


Genetic Programming-Based Discovery of Ranking Functions.. - Fan, Gordon, Pathak (2005)   (Correct)

No context found.

Harman, D.K. Overview of the fourth Text REtrieval Conference (TREC-4). In D.K. Harman (ed.), Proceedings of the Fourth Text Retrieval Conference. Gaithersburg, MD: National Institute of Standards and Technology, 1996, pp. 1--24.


When one Sample is not Enough: Improving Text Database.. - Ipeirotis, Gravano (2004)   (Correct)

No context found.

D. Harman. Overview of the Fourth Text REtrieval Conference (TREC-4). In NIST Special Publication 500-236: The Fourth Text REtrieval Conference (TREC-4), 1996.


Text Augmentation: Inserting XML tags into natural language text.. - Yeates (2003)   (Correct)

No context found.

Donna K. Harman. Overview of the fourth text retrieval conference. In The Fourth Text REtrieval Conference, pages 1--24, Gaithersburg, Maryland, USA, November 1-3 1995. National Institute of Standards and Technology.


IRIS at TREC-7 - Kiduk Yang Kelly (1999)   (1 citation)  (Correct)

No context found.

Harman, D. (1996). Overview of the Fourth Text REtrieval Conference (TREC-4). In D. K. Harman (Ed.), The Fourth Text REtrieval Conference (TREC-4) (NIST Spec. Publ. 500-236, pp. 25-48). Washington, DC: U.S.


CiteSeer.IST - Copyright Penn State and NEC