Joining Massive High-Dimensional Datasets (2003)  (1 citation)
Tamer Kahveci, Christian Lang, Ambuj K Singh
ICDE

View or download:
ucsb.edu/research/trcs/doc...200230.ps

From:  ucsb.edu/research/t...index.shtml

Abstract: We consider the problem of joining massive datasets. We propose two techniques for minimizing disk I/O cost of join operations for both spatial and sequence data. Our techniques optimize the available buffer space using a global view of the datasets. We build a boolean matrix on the pages of the given datasets using a lower bounding distance predictor. The marked entries of this matrix represent candidate page pairs to be joined. Our first technique joins the marked pages iteratively. Our...
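
The matrix construction described in the abstract can be illustrated with a small sketch. The abstract does not say which lower-bounding predictor is used, so everything concrete here is an assumption made for illustration: each page is summarized by its minimum bounding rectangle, distance is Euclidean, a page pair is marked when the lower bound is within a threshold `eps`, and the names `mbr`, `min_dist`, and `prediction_matrix` are hypothetical.

```python
import math

def mbr(points):
    """Minimum bounding rectangle of one page, as (low corner, high corner)."""
    dims = range(len(points[0]))
    lo = tuple(min(p[d] for p in points) for d in dims)
    hi = tuple(max(p[d] for p in points) for d in dims)
    return lo, hi

def min_dist(a, b):
    """Lower bound on the Euclidean distance between any point in box a
    and any point in box b (0.0 when the boxes overlap)."""
    (alo, ahi), (blo, bhi) = a, b
    total = 0.0
    for d in range(len(alo)):
        # Per-dimension gap between the two intervals, clamped at zero.
        gap = max(blo[d] - ahi[d], alo[d] - bhi[d], 0.0)
        total += gap * gap
    return math.sqrt(total)

def prediction_matrix(pages_r, pages_s, eps):
    """Boolean matrix: entry [i][j] is True when page i of the first dataset
    and page j of the second may contain a pair within distance eps."""
    boxes_r = [mbr(p) for p in pages_r]
    boxes_s = [mbr(p) for p in pages_s]
    return [[min_dist(br, bs) <= eps for bs in boxes_s] for br in boxes_r]

# Hypothetical toy data: two pages per dataset, two points per page.
pages_r = [[(0, 0), (1, 1)], [(10, 10), (11, 11)]]
pages_s = [[(1.5, 1.5), (2, 2)], [(20, 20), (21, 21)]]
matrix = prediction_matrix(pages_r, pages_s, eps=1.0)
print(matrix)  # [[True, False], [False, False]]
```

Because the predictor is a lower bound, an unmarked entry can be skipped safely: no pair of objects on those two pages can be within `eps` of each other. Only the marked page pairs need to be read and joined.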

Context of citations to this paper:

...entries, r marked rows and c marked columns, then pm-NLJ performs at least e + min(r, c) disk I/Os for that cluster. Proof: omitted (cf. [24]). For the example in Figure 3, r = 3, c = 2, and e = 5. The total number of disk I/Os is 5 + min(2, 3) = 7. Note that NLJ is the same...
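
Taking the bound in the excerpt above to be e + min(r, c), the reading consistent with its stated total of 7, the arithmetic can be checked with a short sketch. The cluster below is hypothetical: Figure 3 is not reproduced on this page, so the entries are simply one arrangement with r = 3, c = 2, and e = 5, and the function name `cluster_io_lower_bound` is an assumption for illustration.

```python
def cluster_io_lower_bound(marked):
    """Return (e, r, c, bound) for a cluster given as a set of
    (row, column) indices of marked page pairs."""
    e = len(marked)                        # marked entries
    r = len({row for row, _ in marked})    # distinct marked rows
    c = len({col for _, col in marked})    # distinct marked columns
    return e, r, c, e + min(r, c)

# Hypothetical cluster with 3 marked rows, 2 marked columns, 5 marked entries.
cluster = {(0, 0), (0, 1), (1, 0), (1, 1), (2, 0)}
print(cluster_io_lower_bound(cluster))  # (5, 3, 2, 7)
```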

Cited by:
Joining Massive High-Dimensional Datasets - Kahveci, Lang, Singh (2003)

Active bibliography (related documents):
0.6:   Optimizing Similarity Search for Arbitrary Length Time Series.. - Kahveci, Singh (2003)
0.6:   R-Trees Have Grown Everywhere - Manolopoulos, Nanopoulos..
0.5:   GORDER: An Efficient Method for KNN Join Processing - Chenyi Xia Hongjun (2004)

Similar documents based on text:
0.3:   Variable Length Queries for Time Series Data - Kahveci, Singh (2001)
0.3:   An Efficient Index Structure for String Databases - Kahveci, Singh (2001)
0.3:   Shift and Scale Invariant Search of Multi-attribute Time.. - Kahveci, Singh, Gurel (2001)

BibTeX entry:

T. Kahveci, C. A. Lang, and A. K. Singh. Joining massive high-dimensional datasets. Technical Report 30, UCSB, 2002. http://citeseer.ist.psu.edu/article/kahveci03joining.html

@inproceedings{ kahveci-joining,
    author = "Tamer Kahveci and Christian Lang and Ambuj K Singh",
    title = "Joining Massive High-Dimensional Datasets",
    booktitle = "{ICDE}",
    year = "2003",
    url = "citeseer.ist.psu.edu/article/kahveci03joining.html" }

Citations (may not include all citations):
4212   Computers and intractability: A guide to the theory of NP-Com.. - Garey, Johnson - 1979
516   The R*-tree: An efficient and robust access method for points and r.. - Beckmann, Kriegel et al. - 1990
241   Fast subsequence matching in time-series databases - Faloutsos, Ranganathan et al. - 1994
205   Efficient similarity search in sequence databases - Agrawal, Faloutsos et al. - 1993
159   Efficient processing of spatial joins using R-trees - Brinkhoff, Kriegel et al. - 1993
134   Spatial query processing in an object-oriented database syste.. - Orenstein - 1986
126   Fast similarity search in the presence of noise - Agrawal, Lin et al. - 1995
122   Database Management Systems - Ramakrishnan, Gehrke - 2000
115   Partition based spatial-merge join - Patel, DeWitt - 1996
89   Multi-step processing of spatial joins - Brinkhoff, Kriegel et al. - 1994
87   Similarity-based queries for time series data - Rafiei, Mendelzon - 1997
81   Spatial hash-joins - Lo, Ravishankar - 1996
70   Optimal aggregation algorithms for middleware - Fagin, Lotem et al. - 2001
68   Spatial joins using seeded trees - Lo, Ravishankar - 1994
61   Efficient time series matching by wavelets - Chan, Fu - 1999
61   Scalable sweeping-based spatial join - Arge, Procopiuc et al. - 1998
55   Spatial joins using R-trees: Breadth-first traversal with gl.. - Huang, Jing et al. - 1997
49   Size separation spatial join - Koudas, Sevcik - 1997
43   Dimensionality reduction for similarity searching in dynamic .. - Kanth, Agrawal et al. - 1998
41   Incremental distance join algorithms for spatial databases - Hjaltason, Samet - 1998
37   Storage and access in relational data bases - Blasgen, Eswaran - 1977
27   Matching and indexing sequences of different lengths - Bozkaya, Yazdani et al. - 1997
27   Fast time-series searching with scaling and shifting - Chu, Wong - 1999
25   High-dimensional similarity joins - Shim, Srikant et al. - 2002
24   Variable length queries for time series data - Kahveci, Singh - 2001
21   A performance evaluation of spatial jo - Papadopoulos, Rigaux et al. - 1999
20   Join algorithm costs revisited - Harris, Ramamohanarao - 1996
19   High dimensional similarity joins: algorithms and performanc.. - Koudas, Sevcik - 2000
19   Sort-merge-join: An idea whose time has - Graefe - 1994
18   An efficient index structure for string databases - Kahveci, Singh - 2001
15   Approximate nearest neighbors and sequence comparison with b.. - Muthukrishnan, Sahinalp - 2000
15   An analysis of schedules for performing multipage requests - Seeger - 1996
13   Reading a set of disk pages - Seeger, Larson et al. - 1993
13   TSA-tree: A wavelet-based approach to improve the efficiency o.. - Shahabi, Tian et al. - 2000
8   Dissecting CPU and memory optimization effects - Manegold, Boncz et al. - 2000
8   A cost model and index architecture for the similarity join - Bohm, Kriegel - 2001
7   Efficient scheduling of page access in index-based jo - Chan, Ooi - 1997
6   Epsilon grid order: An algorithm for the similarity join on .. - Bohm, Braunmuller et al. - 2001
6   BLOCKS database and its applications - Henikoff, Henikoff - 1996
4   Similarity searching for multi-attribute sequences - Kahveci, Singh et al. - 2002
3   On sort-merge algorithm for band joins - Lu, Tan - 1995
3   GESS: a scalable similarity join algorithm for mining large d.. - Dittrich, Seeger - 2001
2   Clustering non-uniform-sized spatial objects to reduce I/O c.. - Xiao, Zhang et al. - 2001
2   sortsweep algorithm new method R tree based spatial join - Rigaux, sweep et al. - 2000
1   Join: an easy-to-use generic algorithm for efficiently proce.. - Bercken, Schneider et al.

Documents on the same site (http://www.cs.ucsb.edu/research/trcs/index.shtml):
STATL Definition - Eckmann, Vigna, Kemmerer (2000)
A Comparison of Feedback Based and Fair Queuing Mechanisms.. - Iancu, Acharya (2001)
An Evaluation of Search Tree Techniques In The Presence of Caches - Iancu, Acharya (2001)
