ROBERTSON, S. E., VAN RIJSBERGEN, C. J., AND PORTER, M. F. 1981. Probabilistic models of indexing and searching. In Information Retrieval Research, R. N. Oddy et al., Eds. Butterworths, 35--56.
....in particular to the indexing paradigm). The recognition problem causes one to regard two independent models: one with respect to retrieval and one with respect to analysis of abstracts. This point of view is important for an approach to optimal indexing [6], but it is not self-evident. In [14] the retrieval-oriented approach of Robertson and the indexing-oriented approach of Harter [13] are brought together. The result is a one-model approach, like other approaches in this field (for example [15]). The internal model M_I is restricted to the base of the decision to be made. This ....
Robertson, S.E., van Rijsbergen, C.J., Porter, M.F., Probabilistic models of indexing and searching, in Oddy, R.N., Robertson, S.E., van Rijsbergen, C.J., Williams, P.W. (eds.), Information Retrieval Research (Butterworths, London, 1981).
.... problem by regarding document components as units to which the index term weights relate; however, experimental evaluations showed that this model is inferior to non-probabilistic indexing approaches [19]. A different model for using probabilistic indexing weights in retrieval is described in [26] as the 2-Poisson Independence model, but it also had little success (mainly because of parameter estimation problems). In contrast to these results, the approaches developed in [6, 7, 35] show improvements over binary indexing; however, these models lack an explicit notion of an event to which ....
S.E. Robertson, C.J. Van Rijsbergen, and M.F. Porter. Probabilistic models of indexing and searching. In: R.N. Oddy, S.E. Robertson, C.J. Van Rijsbergen, and P.W. Williams, editors: Information Retrieval Research, pages 35--56. Butterworths, London, 1981.
....refer the reader to the above references, with the suggestion that the book by Mosteller and Wallace [40] is the clearest treatment from a classification standpoint. Despite considerable study, explicit use of Poisson mixtures for text retrieval has not proven more effective than using the BIM [35, 42]. This failure has been variously blamed on the larger number of parameters these models require estimating, the choice of estimation methods, the difficulty of accounting for document length in these models, and the poor fit of the models to actual term frequencies. In contrast, a recently ....
S. E. Robertson, C. J. van Rijsbergen, and M. F. Porter. Probabilistic models of indexing and searching. In R. N. Oddy, S. E. Robertson, C. J. van Rijsbergen, and P. W. Williams, editors, Information Retrieval Research, chapter 4, pages 35--56. Butterworths, 1981.
....model that has proven to be most effective; however, the field has progressed in two different ways. On the one hand, theoretical studies of an underlying model have been developed; this direction is, for example, represented by the various kinds of logic models and probabilistic models (e.g. [14, 3, 15, 22]). On the other hand, there have been many empirical studies of models, including many variants of the vector space model (e.g. [17, 18, 19]). In some cases, there have been theoretically motivated models that also perform well empirically; for example, the BM25 retrieval function, motivated by ....
S. E. Robertson, C. J. van Rijsbergen, and M. F. Porter (1981). "Probabilistic models of indexing and searching", in Oddy R. N. et al. (Eds.) Information Retrieval Research, Butterworths, London, 1981, pp. 35-56.
....term co-occurrence has been used as a query expansion technique. The general approach has been to expand a user's submitted query with synonyms which have been found to co-occur with the terms actually submitted. Overall, this technique has met with mixed results (Lesk 1969; Sparck Jones 1971; Robertson et al. 1981; Salton 1986; Peat & Willett 1991). Another area, which has also received a good deal of attention, though only sporadically from the perspective of information retrieval, is that of lexical collocation. A lexical collocation, defined broadly, is an arbitrary and recurrent word combination ....
Robertson, S., C. J. van Rijsbergen, & M. Porter (1981). Probabilistic models of indexing and searching. In R. N. Oddy, S. Robertson, C. J. van Rijsbergen, & P. Williams, editors, Information Retrieval Research, pages 35--56. Butterworths, London.
.... Retrieval Models Models of information retrieval systems usually suggest that documents be assigned a retrieval status value by which documents may be ranked, with the highest ranked document presented to the searcher first, followed by the presentation of documents of expected lower value [1, 15, 18]. One specific model of retrieval is the probabilistic model, which uses the following retrieval rule: A document should be retrieved if the expected cost of retrieving the document ( is less than the expected cost of not retrieving the document ( More formally, a document should be ....
....to retrieve a document may now be transformed to: Retrieve a document with characteristics if and only if where the right-hand side of this expression is a cost constant that must be exceeded if retrieval is to occur. Documents may then be ranked by the value of the left-hand side of this formula [15, 14]. This value may be estimated as . 3 Term Independence If the features are assumed to be statistically independent in the document, that is, and are independent in both the set of relevant and the set of non-relevant documents, then the weight for a document may be computed as Removing the ....
Stephen E. Robertson, C. J. Van Rijsbergen, and M.F. Porter. Probabilistic models of indexing and searching. In Robert Oddy, S. E. Robertson, C. J. van Rijsbergen, and P. W. Williams, editors, Information Retrieval Research, pages 35--56, London, 1981. Butterworths.
....problem of automatic indexing, that is, identifying a set of terms by which a document could be represented. Documents were assumed to be of uniform length, and the theory was developed using Poisson means rather than Poisson rates, where the mean is the rate times document length. The TPI model [6] is a combination of the binary independence model and the 2-Poisson model. A previous approach which extended the binary independence model to cover multiple term occurrences is described in [11] and [12]. Another approach in which text is generated by a stochastic process is [4], which employs a ....
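The distinction drawn in the excerpt above between a Poisson rate and a Poisson mean (mean = rate times document length) can be illustrated with a minimal sketch. The function names, the per-token rate, and the document lengths below are hypothetical values chosen for illustration; they do not come from any of the cited papers.

```python
import math


def poisson_mean(rate_per_token: float, doc_length: int) -> float:
    """Poisson mean for one document: the occurrence rate times the document length."""
    return rate_per_token * doc_length


def tf_probability(tf: int, rate_per_token: float, doc_length: int) -> float:
    """Probability of seeing the term exactly `tf` times under a single Poisson
    whose mean is scaled by the document length (an illustrative assumption)."""
    mean = poisson_mean(rate_per_token, doc_length)
    return math.exp(-mean) * mean ** tf / math.factorial(tf)


# Hypothetical rate of 0.01 occurrences per token.
short_doc = tf_probability(0, 0.01, 200)  # chance the term is absent from a 200-token document
long_doc = tf_probability(0, 0.01, 400)   # chance the term is absent from a 400-token document
```

With a fixed rate, the longer document has the larger mean, so the term is less likely to be absent from it. Assuming uniform document length, as the excerpt describes, removes this dependence and lets the mean be used directly.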
Robertson, S. E., C. J. van Rijsbergen, M. F. Porter. Probabilistic models of indexing and searching. In Oddy, R. N., et al., Information Retrieval Research, pp. 35-56. Butterworths, 1981.
....The net contribution of the term is the (weighted) difference between those two values, which is guaranteed to be low. This property is important since random chance dictates much of the occurrence characteristics of low-frequency terms. This can be compared with the probabilistic approaches [8, 9] in which the final contribution of a term is a ratio between a value due to relevant documents and a value due to non-relevant documents (or more precisely, the log of a ratio). The values are lower for the more infrequent terms, but since a ratio is being computed, if the relevant value and ....
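The log-of-a-ratio term contribution mentioned in the excerpt above can be sketched as follows. The odds-ratio form used here is an assumption made for illustration (the excerpt does not give the exact formula), and the probabilities are made-up values.

```python
import math


def log_ratio_weight(p_rel: float, p_nonrel: float) -> float:
    """Log of the ratio between the term's odds in relevant documents and its
    odds in non-relevant documents (assumed form, for illustration only)."""
    odds_rel = p_rel / (1.0 - p_rel)
    odds_nonrel = p_nonrel / (1.0 - p_nonrel)
    return math.log(odds_rel / odds_nonrel)


# Hypothetical probabilities: a common term and a term 100 times rarer.
common_term = log_ratio_weight(0.4, 0.2)
rare_term = log_ratio_weight(0.004, 0.002)
```

Both underlying values are much smaller for the rare term, yet its weight stays well above zero because only the ratio matters. This is the contrast the excerpt draws with a difference-based contribution, which shrinks along with the values themselves.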
S.E. Robertson, C.J. van Rijsbergen, and M.F. Porter. Probabilistic models of indexing and searching. In R.N. Oddy, S.E. Robertson, C.J. van Rijsbergen, and P.W. Williams, editors, Information Retrieval Research. Butterworths, London, 1981.
....assumptions will give us formulae for $P(TF \mid E)$ and $P(TF \mid \bar{E})$ in terms of each of the two Poisson means. The same formulae will cover the case $TF = 0$, i.e. the term is absent. We can also define the probabilities for eliteness given likedness, namely $P(E \mid L)$ and $P(E \mid \bar{L})$. The basic assumption (Robertson, van Rijsbergen and Porter 1981) is now that TF depends directly on eliteness only, so that the relationship between TF and likedness is through eliteness. This relationship is expressed by means of two equations, one involving $L$: $P(TF \mid L) = P(TF \mid E)\,P(E \mid L) + P(TF \mid \bar{E})\,P(\bar{E} \mid L)$, and a second, similar one involving $P(TF \mid \bar{L})$ ....
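The assumption described in the excerpt above, that term frequency depends on likedness only through eliteness, amounts to mixing the elite and non-elite Poisson distributions by the probability of eliteness given likedness. A minimal sketch follows; the function names and all parameter values are hypothetical and serve only to illustrate the calculation.

```python
import math


def poisson_pmf(tf: int, mean: float) -> float:
    """Poisson probability of observing exactly `tf` occurrences."""
    return math.exp(-mean) * mean ** tf / math.factorial(tf)


def p_tf_given_liked(tf: int, mean_elite: float, mean_non_elite: float,
                     p_elite_given_liked: float) -> float:
    """P(TF | L) as a mixture over eliteness:
    P(TF | E) P(E | L) + P(TF | not E) P(not E | L)."""
    elite_part = poisson_pmf(tf, mean_elite) * p_elite_given_liked
    non_elite_part = poisson_pmf(tf, mean_non_elite) * (1.0 - p_elite_given_liked)
    return elite_part + non_elite_part


# Hypothetical values: elite mean 4.0, non-elite mean 0.5, P(E | L) = 0.7.
distribution = [p_tf_given_liked(tf, 4.0, 0.5, 0.7) for tf in range(60)]
```

The same function gives the corresponding quantity for documents that are not liked when called with the probability of eliteness given non-likedness in place of `p_elite_given_liked`.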
Robertson, S.E., van Rijsbergen, C.J. and Porter, M.F. (1981) Probabilistic models of indexing and searching. In Information retrieval research (Ed. R.N. Oddy et al.). London: Butterworths, 35-56.
....formulae, and it seems clear that it can contribute to better retrieval performance. However, there is no obvious reason why any particular function of tf should be used in retrieval. There is not much in the way of formal models which include a tf component; one which does is the 2-Poisson model [7, 8]. The 2-Poisson model postulates that the distribution of within-document frequencies of a content-bearing term is a mixture of two Poisson distributions: one set of documents (the elite set for the particular term, which may be interpreted to mean those documents which can be said to be ....
.... may be interpreted to mean those documents which can be said to be about the concept represented by the term) will exhibit a Poisson distribution of a certain mean, while the remainder may also contain the term but much less frequently (a smaller Poisson mean). Some earlier work in this area [8] attempted to use an exact formula derived from the model, but had limited success, probably partly because of the problem of estimating the required quantities. The approach here is to use the behaviour of the exact formula to suggest a very much simpler function of tf which behaves in a similar ....
Robertson, S.E., Van Rijsbergen, C.J. & Porter, M.F. Probabilistic models of indexing and searching. In Oddy, R.N. et al. (Eds.), Information Retrieval Research (pp. 35--56). London: Butterworths, 1981.
Robertson, S. E., van Rijsbergen, C. J., & Porter, M. F. (1981). Probabilistic models of indexing and searching. In Proceedings of the 3rd annual acm conference on research and development in information retrieval (pp. 35--56). Butterworth & Co.
Robertson, S. E., van Rijsbergen, C. J., and Porter, M. F. (1981). Probabilistic models of indexing and searching. In Oddy, R. N., et al., editors, Information Retrieval Research, pages 35--56. Butterworths.
Robertson, S. E., van Rijsbergen, C.J., and Porter, M. F., Probabilistic models of indexing and searching, Proceedings of SIGIR, 1980.