| Blair, David C., and Maron, M. E. "An Evaluation of Retrieval Effectiveness for a Full-Text Document-Retrieval System," Communications of the ACM (28:3), March 1985, pp. 289-299. |
....with every new document. This premise was initially demonstrated in [47] for a small document collection, but its validity for larger collections was previously not investigated. Experimentation using only very small collections is known to be not necessarily indicative for larger collections [2]. Researchers demonstrated that it is not necessary to update the idf for every new document [13]. In an unpublished study, the idf update interval was evaluated by using a training collection of fifty percent of the document collection. Sequences of text were added to the collection, and the ....
D.C. Blair and M.E. Maron. An evaluation of retrieval effectiveness for a full-text document-retrieval system. Communications of the ACM, 28(3):289--299, 1985.
....fair to Boolean queries. The number of conjunctions in Boolean queries derived from the topic descriptions in natural language tends to be high (i.e. query exhaustivity was high). In queries without expansion, precision collapses already at low recall levels because of the exact match requirement [2, 20]. A similar drop is not likely to happen in best match queries. In the latter study [12], Boolean queries contained few conjunctions (i.e. query exhaustivity was low) to guarantee high recall. Best match queries were punished since higher query exhaustivity could have improved their performance ....
Blair, D.C. & Maron, M.E. (1985). An evaluation of retrieval effectiveness for a full-text document retrieval system. Communications of the ACM (28)3, 289-299.
....of research output (see e.g. [8] and other TREC reports) and also in the slow development of system-oriented evaluation methods for the Boolean IR model. Research on operational systems has focused on Boolean IR systems but the contribution on the development of methods has been very slight [3, 28]. Research within the Cranfield paradigm has shared a very critical attitude towards the Boolean IR model [7]. The studies of Salton [21] and Turtle [30] are examples of attempts to show empirically the overall superiority of the best match IR models over the Boolean IR model. The results of some ....
....in the well-known STAIRS study, the searchers had a predefined goal to locate at least 75 per cent of all relevant documents. It turned out that less than 20 per cent of relevant documents were found. On the other hand, the average precision of the test queries was as high as 79 per cent [3]. The searchers were obviously formulating high precision queries although they were asked to work towards high recall. The latter two features (no ranking, little control over the output size) of the Boolean IR model cause problems in measuring the performance at the standard point of operation ....
Blair, D.C. & Maron, M.E. (1985). An evaluation of retrieval effectiveness for a full-text document retrieval system. Comm. of the ACM (28)3, 289-299.
....systems. Hersh and Hickam (1995) survey prior research comparing Boolean and best match searching; results are not completely consistent, but most of the small number of studies support Turtle. Furthermore, there are substantial theoretical grounds for expecting such a result. For example, Blair and Maron (1985) argue strongly that no conceivable full text retrieval system can achieve consistently good recall with a large database, but their arguments apply only to exact match evaluation: there is no theoretical reason why best match systems cannot produce good recall with databases of any size. Finally, ....
Blair, David and Maron, M.E. (1985). An Evaluation of Retrieval Effectiveness for a Fulltext Document-Retrieval System. CACM 28,3 (March 1985), 289--299.
....Viles and French [74] showed that the inverse document frequencies did not have to be updated very often to assure good accuracy, but this work was done on a very small document collection. Experimentation on small document collections was shown to be nonindicative for larger collections [75]. We tested the hypothesis that a sufficient amount of training data is sufficient to assign the term weights and once the weights are assigned, they need not be updated frequently [19, 20]. We tested a variety of different training set sizes (10 MB, 20 MB, 40 MB, 80 MB, and 160 MB) for a 320 MB ....
Blair D., Maron M., "An Evaluation of Retrieval Effectiveness for a Full-Text Document-Retrieval System", Communications of the ACM, 28(3):289-299.
....such as response time, disk I/O, etc. Most research in IR has centered around effectiveness since it is well accepted that current retrieval systems exhibit mediocre accuracy. A classic study examined real documents and real user requests and found that many relevant documents were not found [3]. Moreover, users have become complacent in their expectation of accuracy of information retrieval systems [11]. Therefore, many information retrieval researchers focus on effectiveness and tend to ignore efficiency since, given poor effectiveness, efficiency is of secondary concern. For example, ....
....to be updated with every new document. This premise was initially demonstrated in [31] for a small document collection, but its validity for larger collections was previously not investigated. Experimentation using only very small collections is known to be nonindicative for larger collections [3]. It has been demonstrated that it is not necessary to update the idf for every new document [10]. In that study, the idf update interval was evaluated by using a training collection of fifty percent of the document collection. Sequences of text were added to the collection, and the idfs were ....
D.C. Blair and M.E. Maron. An evaluation of retrieval effectiveness for a full-text document-retrieval system. Communications of the ACM, 28(3):289--299, 1985.
....but do match one which is semantically equivalent, for example "Sarich engine", "motor developed by Ralph Sarich", etc. Another complication arises from the expression of needs in terms of explicit or implicit document metadata, such as "all documents authored by Doug Engelbart". Blair and Maron [2] discuss retrieval effectiveness in a high recall legal application. In each of the first two types, evaluation requires access to only very few documents. In Type B, relevance, even strong relevance, is usually not sufficient and evaluation must be based on assessment of whether a document is the ....
David C. Blair and M.E. Maron. An evaluation of retrieval effectiveness for a full-text document-retrieval system. Communications of the ACM, 28(3):289--299, 1985.
....retrieval, because the documents are mostly unstructured and do not permit standard database solutions. This is not a trivial problem for a computer to solve, and a number of studies have amply illustrated the difficulties in retrieving documents based on their content. For example, Blair & Maron [8] report that users of one mainstream text retrieval system, IBM's STAIRS (STorage And Information Retrieval System), were able to find only 20% of all the useful documents in a collection. However, users' expectations were much higher: They believed that 75% of the useful documents would be ....
David C. Blair and M. E. Maron. An evaluation of retrieval effectiveness for a fulltext document retrieval system. Communications of the ACM, 28(3):289--299, March 1985.
....of retrieved material that is relevant, i.e. it measures how well the system retrieves only the relevant components. Recall can also be interpreted as the probability that a relevant component will be retrieved, and precision as the probability that a retrieved component will be relevant [Blair and Maron, 1985]. Recall and precision can be defined also as follows. Let C be the whole collection of components forming the library. For each query, C can be partitioned into two disjoint sets, R, the set of relevant material, and R̄, the set of irrelevant material. Given the query, the system retrieves a set ....
Blair, D. and Maron, M. (1985). An evaluation of retrieval effectiveness for a full-text document retrieval system. Communications of the ACM, 28(3):289--299.
....how well a retrieval algorithm performed in terms of finding documents a user will deem relevant. It is well accepted that current retrieval systems exhibit mediocre effectiveness. A classic study examined real documents and real user requests and found that many relevant documents were not found [4]. Moreover, users have become complacent in their expectation of accuracy of information retrieval systems [9]. Most IR systems focus on this problem because efficiency is considered secondary. A recent compendium of research papers considered to be of great import to the IR community, included ....
D.C. Blair and M.E. Maron. An evaluation of retrieval effectiveness for a full-text document-retrieval system. Communications of the ACM, 28(3):289--299, 1985.
....the microbiology literature. It is the non-keyword terms in these documents that will determine whether or not they are relevant. (Were you looking for information on computer communications or for information on protein synthesis?) The case is an extreme one, but the phenomenon is real and common [4]. We note that even within the relatively narrow domain of computers and communications, ATM is an ambiguous term which stands for both automatic teller machine and asynchronous transfer mode. Our aim now is to describe a particular computational approach for dealing with the document retrieval ....
David C. Blair and M. E. Maron. An evaluation of retrieval effectiveness for a fulltext document retrieval system. Communications of the ACM, 28(3 (MAR)):289--299, 1985.
....step of the extraction process. More specifically, much extraction research concerns linguistic ambiguity and query formulation. The former is challenging, in part, because people rarely use the same keyword to describe an object [4], leading to retrieval that misses up to 80% of relevant objects [7]. From an ICAM viewpoint, technological issues in extraction primarily concern the integration of information retrieval with other technologies. For example, the user interface is an important component of extraction activities. Scientific visualization techniques [16, 26, 29, 30] allow a user to, ....
David C. Blair and M. E. Maron. An evaluation of retrieval effectiveness for a full-text document-retrieval system. Communications of the ACM, 28(3):289--299, March 1985.
....feasible. And what about non-commercial or non-profit documents? Though subject cataloging is not without its problems (Larson 1991), association-based automatic indexing would provide an enormous cost-effective improvement over the limitations of the full text boolean keyword search environment (Blair & Maron 1985; Blair & Maron 1990) which currently dominates network information services. Human indexing of network resources, prohibited by overwhelming time and cost factors, is unrealistic. Simple word frequency applications to classification do not attempt to account for the historical evidence provided ....
Blair, D. C. & M. E. Maron (1985). Evaluation of retrieval effectiveness for a full-text document-retrieval system. Communications of the ACM, 28(3):289--299.
....also be useful in traditional batch query processing, because it can serve as a form of pipelined, approximate sorting. Keywords: Online Reordering, Informix, Interactive Data Processing, User Control. 1 Introduction. It has often been noted that information analysis tools should be interactive [Blair and Maron 1985; Bates 1979; Bates 1990] since the data exploration tasks they enable are often only loosely specified. Information seekers work in an iterative fashion, starting with broad queries and continually refining them based on feedback and domain knowledge (see [O'Day and Jeffries 1993] for a user ....
Blair, D. and Maron, M. An evaluation of retrieval effectiveness for a full-text document retrieval system. Communications of the ACM, 28(3), 1985.
Blair, David C., and Maron, M. E. "An Evaluation of Retrieval Effectiveness for a Full-Text Document-Retrieval System," Communications of the ACM (28:3), March 1985, pp. 289-299.
Blair, D.C., and Maron, M.E. (1985). An evaluation of retrieval effectiveness for a full-text document-retrieval system. Communications of the ACM, 28(3), 289-299.
D. Blair and M. Maron. An evaluation of retrieval effectiveness for a full-text document retrieval system. CACM, 28(3), 1985.
Blair, D.C. and Maron, M.E. (1985). An Evaluation of Retrieval Effectiveness for a Full-Text Document-Retrieval System. Communications of the ACM, 28, 3, 289-299.
D.C. Blair and M.E. Maron, "An Evaluation of Retrieval Effectiveness for a Full-Text Document-Retrieval System," Comm. ACM, vol. 28, no. 3, 1985, pp. 289-299.
Blair, D., Maron, M., An evaluation of retrieval effectiveness for a full-text document retrieval system. Communications of the ACM, New York: ACM, 289-299, (1985).
Blair, D.C. and Maron, M.E. An evaluation of retrieval effectiveness for a full-text document retrieval system. Commun. ACM 28, 3, 289-299.
D.C. Blair, & M.E. Maron. An evaluation of retrieval effectiveness for a full-text document-retrieval system. Communications of the ACM, 28(3): 289-299, 1985.
Blair, D.C. and Maron, M.E. An evaluation of retrieval effectiveness for a full-text document-retrieval system. Communications of the ACM, 28(3),
D. Blair and M. Maron. An evaluation of retrieval effectiveness for a full-text document retrieval system. Communications of the ACM, New York: ACM, 289-299. (1985)
Blair, D. C., & Maron, M. E. (1985). An evaluation of retrieval effectiveness for a fulltext document retrieval system. Communications of the ACM, 28, 289-299.
CiteSeer.IST - Copyright Penn State and NEC