Balancing Push and Pull for Data Broadcast
Cited by 189 (7 self)
The increasing ability to interconnect computers through internetworking, wireless networks, high-bandwidth satellite, and cable networks has spawned a new class of information-centered applications based on data dissemination. These applications employ broadcast to deliver data to very large client populations. We have proposed the Broadcast Disks paradigm [Zdon94, Acha95b] for organizing the contents of a data broadcast program and for managing client resources in response to such a program. Our previous work on Broadcast Disks focused exclusively on the "push-based" approach, where data is sent out on the broadcast channel according to a periodic schedule, in anticipation of client requests. In this paper, we study how to augment the push-only model with a "pull-based" approach of using a backchannel to allow clients to send explicit requests for data to the server. We analyze the scalability and performance of a broadcast-based system that integrates push and pull and study the impac...
Voronoi Diagrams and Delaunay Triangulations
Computing in Euclidean Geometry, 1992
Cited by 175 (3 self)
The Voronoi diagram is a fundamental structure in computational geometry and arises naturally in many different fields. This chapter surveys properties of the Voronoi diagram and its geometric dual, the Delaunay triangulation. The emphasis is on practical algorithms for the construction of Voronoi diagrams. 1 Introduction Let S be a set of n points in d-dimensional Euclidean space E^d. The points of S are called sites. The Voronoi diagram of S splits E^d into regions with one region for each site, so that the points in the region for site s ∈ S are closer to s than to any other site in S. The Delaunay triangulation of S is the unique triangulation of S so that there are no elements of S inside the circumsphere of any triangle. Here "triangulation" is extended from the planar usage to arbitrary dimension: a triangulation decomposes the convex hull of S into simplices using elements of S as vertices. The existence and uniqueness of the Delaunay triangulation are perhaps not obvio...
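The empty-circumsphere property in the abstract above can be checked directly in the plane. The following is a minimal sketch, assuming 2D sites given as integer or float tuples and using the standard in-circle determinant; the function names (`orient`, `in_circumcircle`, `is_delaunay`) are illustrative and do not come from the chapter.

```python
def orient(a, b, c):
    """Twice the signed area of triangle abc (positive if counter-clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_circumcircle(a, b, c, p):
    """True if p lies strictly inside the circle through a, b and c."""
    rows = []
    for q in (a, b, c):
        dx, dy = q[0] - p[0], q[1] - p[1]
        rows.append((dx, dy, dx * dx + dy * dy))
    (ax, ay, aw), (bx, by, bw), (cx, cy, cw) = rows
    # 3x3 in-circle determinant, expanded along the first row.
    det = (ax * (by * cw - bw * cy)
           - ay * (bx * cw - bw * cx)
           + aw * (bx * cy - by * cx))
    # The sign of det flips with the orientation of (a, b, c), so normalize it.
    return det * orient(a, b, c) > 0


def is_delaunay(triangles, sites):
    """True if no site lies strictly inside the circumcircle of any triangle."""
    return not any(
        in_circumcircle(a, b, c, p)
        for (a, b, c) in triangles
        for p in sites
        if p not in (a, b, c)
    )
```

For a convex quadrilateral there are two candidate triangulations, one per diagonal, and `is_delaunay` distinguishes them. With floating-point coordinates the determinant sign can be unreliable for nearly cocircular sites, so this sketch is best read with exact (integer) inputs in mind.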
Prefetching from a broadcast disk
In Proceedings of ICDE'96: The 1996 International Conference on Data Engineering, 1996
Cited by 89 (10 self)
Broadcast Disks have been proposed as a means to efficiently deliver data to clients in “asymmetric” environments where the available bandwidth from the server to the clients greatly exceeds the bandwidth in the opposite direction. A previous study investigated the use of cost-based caching to improve performance when clients access the broadcast in a demand-driven manner [AAF95]. Such demand-driven access, however, does not fully exploit the dissemination-based nature of the broadcast, which is particularly conducive to client prefetching. With a Broadcast Disk, pages continually flow past the clients so that, in contrast to traditional environments, prefetching can be performed without placing additional load on shared resources. We argue for the use of a simple prefetch heuristic called PT and show that PT balances the cache residency time of a data item with its bandwidth allocation. Because of this tradeoff, PT is very tolerant of variations in the broadcast program. We describe an implementable approximation for PT...
Robust Image Hashing
2000
Cited by 68 (8 self)
The proliferation of digital images creates problems for managing large image databases, indexing individual images, and protecting intellectual property. This paper introduces a novel image indexing technique that may be called an image hash function. The algorithm uses randomized signal processing strategies for a non-reversible compression of images into random binary strings, and is shown to be robust against image changes due to compression, geometric distortions, and other attacks.
Disseminating Updates on Broadcast Disks
1996
Cited by 65 (6 self)
Lately there has been increasing interest in the use of data dissemination as a means for delivering data from servers to clients in both wired and wireless environments. Using data dissemination, the transfer of data is initiated by servers, resulting in a reversal of the traditional relationship between clients and servers. In previous papers, we have proposed Broadcast Disks as a model for structuring the repetitive transmission of data in a broadcast medium. Broadcast Disks are intended for use in environments where, for either physical or application-dependent reasons, there is asymmetry in the communication capacity between clients and servers. Examples of such environments include wireless networks with mobile clients, cable and direct satellite broadcast, and information dispersal applications. Our initial studies of Broadcast Disks focused on the performance of the mechanism when the data being broadcast did not change. In this paper, we extend those results to incorporate the...
Optimal and Sublogarithmic Time Randomized Parallel Sorting Algorithms
SIAM Journal on Computing, 1989
Cited by 60 (12 self)
We assume a parallel RAM model which allows both concurrent reads and concurrent writes of a global memory. Our main result is an optimal randomized parallel algorithm for INTEGER SORT (i.e., for sorting n integers in the range [1, n]). Our algorithm costs only logarithmic time and is the first known that is optimal: the product of its time and processor bounds is upper bounded by a linear function of the input size. We also give a deterministic sub-logarithmic time algorithm for prefix sum. In addition, we present a sub-logarithmic time algorithm for obtaining a random permutation of n elements in parallel. Finally, we present sub-logarithmic time algorithms for GENERAL SORT and INTEGER SORT. Our sub-logarithmic GENERAL SORT algorithm is also optimal. Key words. Randomized algorithms, parallel sorting, parallel random access machines, random permutations, radix sort, prefix sum, optimal algorithms. AMS(MOS) subject classifications. 68Q25. 1 A preliminary version of this paper ...
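To make the INTEGER SORT and prefix sum problems named in this abstract concrete, here is a sequential sketch of sorting n integers in the range [1, n] by counting and prefix sums. It is not the paper's randomized parallel algorithm; it only illustrates the two problem statements, and the names `prefix_sums` and `integer_sort` are hypothetical.

```python
from itertools import accumulate


def prefix_sums(values):
    """Inclusive prefix sums: result[i] = values[0] + ... + values[i]."""
    return list(accumulate(values))


def integer_sort(keys):
    """Stable sort of n integers drawn from the range [1, n] (sequential)."""
    n = len(keys)
    counts = [0] * (n + 1)          # counts[k] = occurrences of key k
    for k in keys:
        if not 1 <= k <= n:
            raise ValueError("keys must lie in the range [1, n]")
        counts[k] += 1
    ends = prefix_sums(counts)      # ends[k] = number of keys <= k
    out = [0] * n
    # Walk the input backwards so equal keys keep their original order.
    for k in reversed(keys):
        ends[k] -= 1
        out[ends[k]] = k
    return out
```

The sequential version does work linear in n; the abstract's optimality criterion asks the same of the product of parallel time and processor count.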
TAILOR: A Record Linkage Toolbox
2002
Cited by 56 (8 self)
Data cleaning is a vital process that ensures the quality of data stored in real-world databases. Data cleaning problems are frequently encountered in many research areas, such as knowledge discovery in databases, data warehousing, system integration and e-services. The process of identifying the record pairs that represent the same entity (duplicate records), commonly known as record linkage, is one of the essential elements of data cleaning. In this paper, we address the record linkage problem by adopting a machine learning approach. Three models are proposed and are analyzed empirically. Since no existing model, including those proposed in this paper, has been proved to be superior, we have developed an interactive Record Linkage Toolbox named TAILOR. Users of TAILOR can build their own record linkage models by tuning system parameters and by plugging in in-house developed and public domain tools. The proposed toolbox serves as a framework for the record linkage process, and is designed in an extensible way to interface with existing and future record linkage models. We have conducted an extensive experimental study to evaluate our proposed models using not only synthetic but also real data. Results show that the proposed machine learning record linkage models outperform the existing ones both in accuracy and in performance.
The Accuracy of Floating Point Summation
1993
Cited by 32 (0 self)
The usual recursive summation technique is just one of several ways of computing the sum of n floating point numbers. Five summation methods and their variations are analyzed here. The accuracy of the methods is compared using rounding error analysis and numerical experiments. Four of the methods are shown to be special cases of a general class of methods, and an error analysis is given for this class. No one method is uniformly more accurate than the others, but some guidelines are given on the choice of method in particular cases.
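As a small illustration of why the choice of summation method matters, the sketch below contrasts recursive summation (a running total updated in input order) with compensated (Kahan) summation. Compensated summation is chosen here only as a well-known alternative; the abstract does not list the five methods, so this should not be read as a reproduction of the paper's analysis. The function names are illustrative.

```python
import math


def recursive_sum(xs):
    """Recursive summation: add each term to a running total, in order."""
    total = 0.0
    for x in xs:
        total += x
    return total


def compensated_sum(xs):
    """Compensated (Kahan) summation: carry the rounding error forward."""
    total = 0.0
    comp = 0.0                      # running estimate of the lost low-order bits
    for x in xs:
        y = x - comp                # apply the correction to the next term
        t = total + y
        comp = (t - total) - y      # what was actually added, minus what was meant
        total = t
    return total


# One large term followed by many terms too small to change it individually.
data = [1.0] + [1e-16] * 100_000
reference = math.fsum(data)         # correctly rounded sum from the standard library
```

On `data`, each tiny term is below half a unit in the last place of 1.0, so recursive summation never moves off 1.0, while the compensated sum stays close to `reference`. Other orderings of the same data behave differently, which is in line with the abstract's remark that no one method is uniformly more accurate.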

