Edward L. Wimmers, Laura M. Haas, Mary Tork Roth, Cristoph Braendli, "Using Fagin's Algorithm for Merging Ranked Results in Multimedia Middleware", Proceedings of the 4th IFCIS International Conference on Cooperative Information Systems, Edinburgh, Scotland, September 1999.
.... that it may appear straightforward to transform an expensive predicate into a normal one: by probing every object for its score, one can build a search index for the predicate (to access objects scored above a threshold or in sorted order), as required by the current processing frameworks [4, 5, 6, 7, 8, 9, 10] (Section 3.3). This naive approach requires a sequential scan, or complete probing, over the entire database: a database of n objects will need n sequential probes for each expensive predicate. Such complete probing at query time is clearly unacceptable in most cases. This paper addresses ....
....processing expensive predicates efficiently. As Section 1 discussed, all current major DBMSs (e.g., Microsoft SQL Server, IBM DB2, Oracle, and PostgreSQL) support such predicates. Top-k queries have been developed recently in two different contexts. First, in a middleware environment, Fagin [7, 8] pioneered ranked queries and established the well-known algorithm (with its improvements in [9, 10]). [12] generalizes to handling arbitrary joins as combining constraints. As Section 3 discusses, these works assume sorted access of search predicates. This paper thus studies probe ....
E. Wimmers, L. Haas, M. Roth, and C. Braendli. Using Fagin's algorithm for merging ranked results in multimedia middleware. International Conference on Cooperative Information Systems, pages 267--278, 1999.
....This work was originally motivated by queries to multimedia databases, e.g. to retrieve images. Stated in IR terms, the algorithms also assume that postings in the inverted lists are sorted by their contributions and are accessed in sorted order. However, several of the algorithms proposed in [15, 18, 19, 30, 39] also assume that once a document is encountered in one of the inverted lists, we can efficiently compute its complete score by performing lookups into the other inverted lists. This gives much better pruning than sorted access alone, but in a search engine context it may not be efficient as it ....
E. Wimmers, L. Haas, M. Roth, and C. Braendli. Using Fagin's algorithm for merging ranked results in multimedia middleware. In Fourth IFCIS Int. Conf. on Cooperative Information Systems, pages 267--278, September 1999.
....his FA algorithm for a class of composition functions. Also, Fagin and Wimmers [17] discuss how to modify the scoring function to incorporate user preferences so that, say, one attribute might be twice as important to a user as the other attributes mentioned in the query. Finally, Wimmers et al. [37] describe their experience in implementing Fagin's original algorithm on Garlic. Fagin's algorithm was markedly more efficient in joining multiple multimedia sources compared to traditional join techniques. However, the paper also points to intrinsic difficulties arising from heterogeneity of ....
E. L. Wimmers, L. M. Haas, M. T. Roth, and C. Braendli. Using Fagin's algorithm for merging ranked results in multimedia middleware. In Proceedings of the Fourth IFCIS International Conference on Cooperative Information Systems (CoopIS 1999).
....attention to algorithms that do not make random accesses. We shall give the algorithm NRA shortly. There are other scenarios where random access is not impossible, but simply expensive. An example arises when the costs correspond to disk access (sequential versus random). Also, Wimmers et al. [WHRB99] discuss a number of systems issues that can cause random access to be expensive. Then we would like the optimality ratio to be independent of the ratio c_R/c_S (recall that c_R is the cost of a single random access, and c_S is the cost of a single sorted access). That is, if we allow c_R and c ....
E. L. Wimmers, L. M. Haas, M. Tork Roth, and C. Braendli. Using Fagin's algorithm for merging ranked results in multimedia middleware. In Fourth IFCIS International Conference on Cooperative Information Systems, pages 267--278. IEEE Computer Society Press, September 1999.
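As a reading aid for the cost ratio mentioned in the excerpt above: in this line of work, the cost charged to the middleware for a run is usually taken to be a weighted count of the two kinds of access. The symbols s (number of sorted accesses) and r (number of random accesses) are introduced here for illustration and do not appear in the excerpt.

```latex
\text{cost} \;=\; s \cdot c_S \;+\; r \cdot c_R
```

Under this accounting, when c_R is much larger than c_S even a handful of random accesses can dominate the total, which is why an optimality ratio that grows with c_R/c_S is a concern.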
....TA to obtain an algorithm NRA ("no random accesses") that does no random accesses. We prove that NRA is instance optimal over all algorithms that do not make random accesses and over all databases. What about situations where random access is not impossible, but simply expensive? Wimmers et al. [WHRB99] discuss a number of systems issues that can cause random access to be expensive. Although TA is instance optimal, the optimality ratio depends on the ratio c_R/c_S of the cost of a single random access to the cost of a single sorted access. We define another algorithm that is a combination of ....
....different reflects the fact that the cost to a middleware system of a sorted access and of a random access may be different. 3 Fagin's Algorithm. In this section, we discuss FA (Fagin's Algorithm) [Fag99]. This algorithm is implemented in Garlic [CHS 95], an experimental IBM middleware system; see [WHRB99] for interesting details about the implementation and performance in practice. Chaudhuri and Gravano [CG96] consider ways to simulate FA by using filter conditions, which might say, for example, that the color score is at least 0.2. FA works as follows. 1. Do sorted access in parallel to each ....
E. L. Wimmers, L. M. Haas, M. Tork Roth, and C. Braendli. Using Fagin's algorithm for merging ranked results in multimedia middleware. In Fourth IFCIS International Conference on Cooperative Information Systems, pages 267--278. IEEE Computer Society Press, September 1999.
....where random access is not forbidden, but simply expensive? Wimmers et al. [WHRB99] discuss a number of systems issues that can cause random access to be expensive. Although TA is instance optimal, the optimality ratio depends on the ratio c_R/c_S of the cost of a single random access to the cost of a single sorted access. We define another algorithm that is a combination of .... (Footnote 5: Mihalis Yannakakis pointed out to us that in the case of, for example, online versus offline algorithms, we can circumvent the issue of whether B ∈ A by simply taking A to be the class of online algorithms.)
....reflects the fact that the cost to a middleware system of a sorted access and of a random access may be different. 3 Fagin's Algorithm. In this section, we discuss FA (Fagin's Algorithm) [Fag99]. This algorithm is implemented in Garlic [CHS 95], an experimental IBM middleware system; see [WHRB99] for interesting details about the implementation and performance in practice. Chaudhuri and Gravano [CG96] consider ways to simulate FA by using filter conditions, which might say, for example, that the color score is at least 0.2. FA works as follows. 1. Do sorted access in parallel to each ....
E. L. Wimmers, L. M. Haas, M. Tork Roth, and C. Braendli. Using Fagin's algorithm for merging ranked results in multimedia middleware. In Fourth IFCIS International Conference on Cooperative Information Systems, pages 267--278. IEEE Computer Society Press, September 1999.
....TA to obtain an algorithm NRA ("no random accesses") that does no random accesses. We prove that NRA is instance optimal over all algorithms that do not make random accesses and over all databases. What about situations where random access is not forbidden, but simply expensive? Wimmers et al. [WHRB99] discuss a number of systems issues that can cause random access to be expensive. Although TA is instance optimal, the optimality ratio depends on the ratio c_R/c_S of the cost of a single random access to the cost of a single sorted access. We define another algorithm that is a combination of ....
....reflects the fact that the cost to a middleware system of a sorted access and of a random access may be different. 3 Fagin's Algorithm. In this section, we discuss FA (Fagin's Algorithm) [Fag99]. This algorithm is implemented in Garlic [CHS 95], an experimental IBM middleware system; see [WHRB99] for interesting details about the implementation and performance in practice. FA works as follows. 1. Do sorted access in parallel to each of the m sorted lists L_i. By "in parallel", we mean that we access the top member of each of the lists under sorted access, then we access the second ....
E. L. Wimmers, L. M. Haas, M. Tork Roth, and C. Braendli. Using Fagin's algorithm for merging ranked results in multimedia middleware. In Fourth IFCIS International Conference on Cooperative Information Systems, pages 267--278. IEEE Computer Society Press, September 1999.
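The excerpt above breaks off partway through step 1 of FA. As a reading aid, here is a minimal Python sketch of the three phases as FA is commonly described: sorted access in parallel until k objects have been seen in every list, random access to fill in the missing grades of every object seen, then selection of the k best aggregate scores. It is an illustration, not the Garlic implementation. The function name fagin_top_k, the in-memory list representation, and the dictionary used to simulate random access are all assumptions of this sketch, as is the requirement that every object appears in every list and that the aggregation function is monotone (min is used by default).

```python
import heapq
from typing import Callable, Dict, List, Sequence, Set, Tuple

# Hypothetical representation: each list holds (object id, grade) pairs,
# sorted by grade in descending order.
SortedList = List[Tuple[str, float]]


def fagin_top_k(lists: Sequence[SortedList], k: int,
                aggregate: Callable[[Sequence[float]], float] = min
                ) -> List[Tuple[str, float]]:
    """Sketch of FA; assumes every object occurs in every list."""
    m = len(lists)
    # Random access is simulated with one dictionary per list (an assumption).
    lookup = [dict(lst) for lst in lists]
    seen_in: Dict[str, Set[int]] = {}
    fully_seen = 0  # objects seen under sorted access in all m lists
    depth = 0
    max_depth = max(len(lst) for lst in lists)

    # Phase 1: sorted access "in parallel": first entry of each list,
    # then the second entry of each list, and so on.
    while fully_seen < k and depth < max_depth:
        for i, lst in enumerate(lists):
            if depth < len(lst):
                obj = lst[depth][0]
                where = seen_in.setdefault(obj, set())
                if i not in where:
                    where.add(i)
                    if len(where) == m:
                        fully_seen += 1
        depth += 1

    # Phase 2: random access to obtain every grade of every object seen.
    scored = [(aggregate([lookup[i][obj] for i in range(m)]), obj)
              for obj in seen_in]

    # Phase 3: keep the k objects with the highest aggregate score.
    return [(obj, score) for score, obj in heapq.nlargest(k, scored)]


color = [("a", 0.9), ("b", 0.8), ("c", 0.3), ("d", 0.1)]
shape = [("b", 0.95), ("c", 0.7), ("a", 0.6), ("d", 0.2)]
top = fagin_top_k([color, shape], k=1)
```

In this toy run, sorted access stops at depth 2 because object b has then been seen in both lists, so object d is never touched; the random-access phase only has to resolve a, b, and c.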
....Then the resulting multimedia objects or documents can be retrieved from the database and delivered via the Internet. However, existing combining algorithms are designed for homogeneous environments and tend to deteriorate to complexities worse than the linear scan for heterogeneous environments [5]. [Figure 1: Architecture of the HERON system] Previous approaches in the field of multi-feature query combination can generally be divided into statistical approaches [9] and those with guaranteed correctness [4, 10]. Whereas statistical approaches assume certain score distributions in the output ....
....ordered by their ranks, if bulk access is feasible). In contrast, a random access is performed if the specific score value for an already seen object is retrieved from any classifier. However, in heterogeneous environments doing random accesses may be very expensive or sometimes even impossible [5, 6]. The algorithms mentioned above may differ in the exact number of random accesses, but all of them need a considerable number of random accesses before termination. In the following we will present a new algorithm retrieving the top k objects by combining n different atomic output streams without ....
Wimmers, Haas, Tork-Roth, Brändli. Using Fagin's Algorithm for Merging Ranked Results in Multimedia Middleware. In: Proc of the Intern. Conf. on Cooperating Information Systems COOPIS'99, pp. 267-278, Edinburgh, Great Britain, 1999
....on an early version of Garlic, and found that although the algorithm is simple, there are many implementation issues that need to be addressed. Braendli did a more extensive implementation in a later version, as a part of a broad study carried out by Wimmers, Haas, Tork Roth, and Braendli [WHTB98], which considered implementation and performance issues. We discuss some of these issues in this section. The performance of A_0, as measured in both [Pr95] and [WHTB98], is consistent with the theoretical analysis. Furthermore, in [WHTB98] they say: We have seen that Fagin's algorithm behaves ....
....extensive implementation in a later version, as a part of a broad study carried out by Wimmers, Haas, Tork Roth, and Braendli [WHTB98], which considered implementation and performance issues. We discuss some of these issues in this section. The performance of A_0, as measured in both [Pr95] and [WHTB98], is consistent with the theoretical analysis. Furthermore, in [WHTB98] they say: We have seen that Fagin's algorithm behaves well for a broad range of queries, and a broad range of access costs. The issues and concerns were the applicability of the algorithm in practice, as we now discuss. One ....
E. L. Wimmers, L. M. Haas, M. Tork Roth, and C. Braendli, Using Fagin's Algorithm for Merging Ranked Results in Multimedia Middleware, to appear (1998).
....natural assumptions that lead to efficient algorithms in cases of interest. 10 Related work. Chaudhuri and Gravano [CG96] consider ways to simulate algorithm A_0 by using filter conditions, which might say, for example, that the color score is at least 0.2. Wimmers, Haas, Tork Roth, and Braendli [WHTB98] carry out detailed studies on the performance of algorithm A_0 and consider implementation issues (see the author's paper [Fa98] for a discussion). 11 Conclusions. We have presented a semantics for Garlic that allows us to combine information from different subsystems in a natural way. ....
E. L. Wimmers, L. M. Haas, M. Tork Roth, and C. Braendli, Using Fagin's Algorithm for Merging Ranked Results in Multimedia Middleware, in preparation (1998).
....use FA. After that point, PJ is the better choice.) In both of these figures, we have also plotted the total amount of time spent on random and sorted accesses. We can see that the amount of time spent doing random accesses is much less for the case where there is some correlation in the scores. [WHRB98] shows this even more dramatically for an almost perfectly correlated search. Note that, even for uncorrelated streams, FA handily beats out the competing pointer join up until k = 190, a full 35% of the data. This is impressive for an algorithm that has as its goal to allow the user to ....
....addition to a middleware engine. Unfortunately, the algorithm as it stands has limited applicability, and this, combined with the difficulty of testing whether it is applicable, makes the cost of the algorithm to query compilation more than it is worth for a general-purpose middleware. In [WHRB98] we examine several variants of the algorithm that might increase its applicability. One variant is fully general, allowing the algorithm to be used for any query; however, the applicability comes at the expense of losing all guarantees about performance. A potentially more interesting variation ....
E. Wimmers, L. Haas, M. Tork Roth, and C. Braendli. Using Fagin's algorithm for merging ranked results in multimedia middleware, August 1998. IBM RJ Number 95005.
....use FA. After that point, PJ is the better choice.) In both of these figures, we have also plotted the total amount of time spent on random and sorted accesses. We can see that the amount of time spent doing random accesses is much less for the case where there is some correlation in the scores. [18] shows this even more dramatically for an almost perfectly correlated search. Note that, even for uncorrelated streams, FA handily beats out the competing pointer join up until k = 190, a full 35% of the data. This is impressive for an algorithm that has as its goal to allow the user to ....
....important addition to a middleware engine. Unfortunately, the algorithm as it stands has limited applicability, and this, combined with the difficulty of testing whether it is applicable, makes the cost of the algorithm to query compilation more than it is worth for a general-purpose middleware. In [18], we examine several variants of the algorithm that might increase its applicability. One variant is fully general, allowing the algorithm to be used for any query; however, the applicability comes at the expense of losing all guarantees about performance. A potentially more interesting variation ....
E. Wimmers, L. Haas, M. T. Roth, and C. Braendli. Using Fagin's algorithm for merging ranked results in multimedia middleware, Aug. 1998. IBM RJ Number 95005.