39 citations found.
Inderjit S. Dhillon and Dharmendra S. Modha. A data-clustering algorithm on distributed memory multiprocessors. In Large-Scale Parallel Data Mining, Lecture Notes in Artificial Intelligence, pages 245--260, 2000.

This paper is cited in the following contexts:

First 50 documents

Parallel Algorithms For Clustering High-Dimensional.. - Nagesh, Goil, Choudhary   (Correct)

....the quality of clustering is heavily dependent on grid size and density threshold parameters. A survey of parallel algorithms for hierarchical clustering using distance-based metrics is given in [Ols95]. These are more theoretical PRAM algorithms. Recently, the k-means algorithm has been parallelized [DM99], but it is limited in its applicability, as it requires the user to specify k, the number of clusters, and also does not find clusters in subspaces. Clusters are unions of connected high-density cells. Two k-dimensional cells are connected if they have a common face in the k-dimensional ....

I.S. Dhillon and D.S. Modha. A data-clustering algorithm on distributed memory multiprocessors. Large-Scale Parallel KDD Systems, ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1999.


Efficient Strategies for Partitioning and Querying.. - Codenotti, De.. (2003)   (Correct)

.... widely investigated problems in many fields (such as Machine Learning, Data Mining, Computational Geometry, and of course Information Retrieval). However, there is no clear indication as to whether or not existing algorithms could effectively be employed in large-scale web applications (see, e.g., [4, 5] for a discussion of the difficulties connected to the efficient clustering of very large document collections and for sequential and distributed algorithms with state-of-the-art performance). In this paper we isolate a problem, which we call the Minimum Redirections Problem, related to the ....

I. S. Dhillon and D. S. Modha, A data clustering algorithm on distributed memory multiprocessing, Large-Scale Parallel Data Mining, Lecture Notes in Artificial Intelligence, Volume 1759, pp. 245-260, 2000.


Data Mining - Challenges, Models, Methods and Algorithms - Hegland (2003)   (Correct)

....fact, if any clusters are changed, J is reduced. As J is bounded from below it converges, and as a consequence the algorithm converges. It is also known that k-means will always converge to a local minimum [17]. The k-means algorithm may be viewed as a variant of the EM algorithm [56]. In [26] a parallel k-means algorithm is proposed. The authors also provide a careful analysis of the algorithm's computational complexity. There are two .... [Algorithm 7, k-means algorithm: Select k arbitrary data points z_1, ..., z_k. repeat ....]

....Finding the minimum for each point requires a total of kN comparisons; then one needs to compute the new average for each cluster, which requires nd additions and kd divisions. In data mining the cost is usually dominated by the determination of all the distances, and thus the time is [26]: T = O(NkdI), where I is the number of iterations. For the parallel algorithm (shared nothing) the data is initially distributed over the discs of all the processors. Then each processor computes the distances of its elements to all cluster centers. This is done in parallel and so the most ....

I.S. Dhillon and D.S. Modha. A data-clustering algorithm on distributed memory multiprocessors. In Zaki and Ho [73].
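
The second Hegland excerpt above describes a shared-nothing scheme in which the data is split across processors and each processor computes the distances of its own points to all cluster centers. The following is a minimal sketch of that idea, not the cited authors' implementation: the function names (`local_step`, `parallel_kmeans`) are hypothetical, the "processors" are simulated sequentially, and the step that combines per-processor sums and counts into new centers is an assumption, since the excerpt is truncated before it describes it.

```python
from math import dist  # Euclidean distance, Python 3.8+


def local_step(block, centers):
    """One processor's share: assign each local point to its nearest center
    and return per-cluster coordinate sums and point counts."""
    k, d = len(centers), len(centers[0])
    sums = [[0.0] * d for _ in range(k)]
    counts = [0] * k
    for x in block:
        # k distance evaluations of d terms each per point: the O(Nkd) part
        j = min(range(k), key=lambda c: dist(x, centers[c]))
        counts[j] += 1
        for t in range(d):
            sums[j][t] += x[t]
    return sums, counts


def parallel_kmeans(blocks, centers, iterations=10):
    """blocks: one list of points per (simulated) processor.
    centers: list of k initial centers, each a list of d floats."""
    k, d = len(centers), len(centers[0])
    for _ in range(iterations):
        # On a real machine these calls would run concurrently, one per node.
        partials = [local_step(b, centers) for b in blocks]
        # Assumed combine step: add partial sums and counts across processors.
        new_centers = []
        for j in range(k):
            n = sum(cnt[j] for _, cnt in partials)
            if n == 0:
                new_centers.append(list(centers[j]))  # keep an empty cluster's center
            else:
                new_centers.append(
                    [sum(s[j][t] for s, _ in partials) / n for t in range(d)]
                )
        if new_centers == centers:  # no center moved: stop early
            break
        centers = new_centers
    return centers
```

With N points split evenly over P blocks, each simulated processor performs about Nkd/P distance work per iteration, which is the term the excerpt identifies as dominant in T = O(NkdI).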


A Parallel Learning Algorithm for Text Classification - Kruengkrai, Jaruskulchai (2002)   (1 citation)  (Correct)

....tree [8]. Several researchers study techniques for parallelizing clustering algorithms, which can be considered as the unsupervised learning problem. Ruocco and Frieder [15] propose parallel single-link and single-pass algorithms for clustering documents, run on an Intel Paragon. Dhillon and Modha [3] introduce an effective parallelization of the k-means clustering algorithm implemented on an IBM POWERparallel SP2. Forman and Zhang [4] also present a general technique for parallelizing a class of center-based clustering algorithms including k-means, k-harmonic means, and the EM algorithm performed ....

Dhillon, I.S., and Modha, D.S. A data-clustering algorithm on distributed memory multiprocessors. Large-Scale Parallel Data Mining, pages 245-260, 1999.


Algorithms for Clustering High Dimensional and - Tao   (Correct)

No context found.

Inderjit S. Dhillon and Dharmendra S. Modha. A data-clustering algorithm on distributed memory multiprocessors. In Large-Scale Parallel Data Mining, Lecture Notes in Artificial Intelligence, pages 245--260, 2000.


PENS: An Algorithm for Density-Based Clustering in.. - Mei Li Guanling   (Correct)

No context found.

I. S. Dhillon and D. S. Modha. A data-clustering algorithm on distributed memory multiprocessors. In Proceedings of Workshop on Large-Scale Parallel KDD Systems (in conjunction with SIGKDD), pages 245--260, August 1999.


Distributed Clustering with Limited Knowledge Sharing - Ghosh, Merugu   (Correct)

No context found.

I. S. Dhillon and D. S. Modha. A data-clustering algorithm on distributed memory multiprocessors. In M. Zaki and C. Ho, editors, Large Scale Parallel Data Mining, pages 245--260. LNCS vol 1759. Springer, 2000.


A Probabilistic Approach to Privacy-sensitive Distributed.. - Srujana Merugu And (2003)   (1 citation)  (Correct)

No context found.

I. S. Dhillon and D. S. Modha. A data-clustering algorithm on distributed memory multiprocessors. In KDD, pages 245--260, 1999.


Privacy-preserving Distributed Clustering using Generative.. - Srujana Merugu And (2003)   (4 citations)  (Correct)

No context found.

I. S. Dhillon and D. S. Modha. A data-clustering algorithm on distributed memory multiprocessors. In ACM SIGKDD, 1999.


Shared Memory Parallelization of Data Mining Algorithms.. - Jin, Yang, Agrawal (2004)   (1 citation)  (Correct)

No context found.

I.S. Dhillon and D.S. Modha, "A Data-Clustering Algorithm on Distributed Memory Multiprocessors," Proc. Workshop Large-Scale Parallel KDD Systems, in conjunction with the Fifth ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining (KDD '99), pp. 47-56, Aug. 1999.


Privacy-Preserving K-Means Clustering over Vertically.. - Vaidya, Clifton (2003)   (2 citations)  (Correct)

No context found.

I. S. Dhillon and D. S. Modha. A data-clustering algorithm on distributed memory multiprocessors. In Proceedings of Large-scale Parallel KDD Systems Workshop, ACM SIGKDD, Aug. 15-18 1999.


Distributed Data Clustering Can Be Efficient And Exact - Forman, Zhang (2000)   (Correct)

No context found.

Dhillon, I.S. and Modha, D.S. "A data clustering algorithm on distributed memory machines," ACM SIGKDD Workshop on Large-Scale Parallel KDD Systems (with KDD99), August 1999.


Compiler and Middleware Support for Scalable Data Mining - Agrawal, Jin, Li   (Correct)

No context found.

Inderjit S. Dhillon and Dharmendra S. Modha. A data-clustering algorithm on distributed memory multiprocessors. In Proceedings of Workshop on Large-Scale Parallel KDD Systems, in conjunction with the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 99), pages 47--56, August 1999.


Speeding up k-means Clustering by Bootstrap Averaging - Ian Davidson And   (Correct)

No context found.

Dhillon, I. S. and Modha, D. M., A Data Clustering Algorithm on Distributed Memory Multiprocessors, in Large-Scale Parallel Data Mining, Lecture Notes in Artificial Intelligence, Volume 1759, pages 245-260, 2000.


Communication and Memory Efficient Parallel Decision Tree.. - Jin, Agrawal (2003)   (Correct)

No context found.

Inderjit S. Dhillon and Dharmendra S. Modha. A data-clustering algorithm on distributed memory multiprocessors. In Proceedings of Workshop on Large-Scale Parallel KDD Systems, in conjunction with the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 99), pages 47--56, August 1999.


Shared Memory Parallelization of Data Mining Algorithms.. - Jin, Agrawal (2002)   (1 citation)  (Correct)

No context found.

Inderjit S. Dhillon and Dharmendra S. Modha. A data-clustering algorithm on distributed memory multiprocessors. In Proceedings of Workshop on Large-Scale Parallel KDD Systems, in conjunction with the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 99), pages 47--56, August 1999.


Distributed Data Mining Bibliography - Hillol   (Correct)

No context found.

I. Dhillon and D. Modha. A Data-clustering Algorithm on Distributed Memory Multiprocessors. In Proceedings of the KDD'99 Workshop on High Performance Knowledge Discovery, pages 245--260, 1999.


CONQUEST: A Distributed Tool for Constructing Summaries of.. - Chi, Koyuturk, Grama (2004)   (Correct)

No context found.

I. S. Dhillon and D. S. Modha. A data-clustering algorithm on distributed memory multiprocessors. In Large-Scale Parallel Data Mining, Lecture Notes in Artificial Intelligence, pages 245--260, 2000.


Towards an Open Service Architecture for Data Mining on.. - Brezany, Hofer, Tjoa.. (2003)   (Correct)

No context found.

I. S. Dhillon and D. S. Modha. A data-clustering algorithm on distributed memory multiprocessors. In M. J. Zaki and C.-T. Ho (eds), Large-Scale Parallel Data Mining, Springer-Verlag, LNCS 1759, pages 245--260, 1999.


Towards Effective and Efficient Distributed Clustering - Januzaj, Kriegel, Pfeifle (2003)   (Correct)

No context found.

Dhillon I. S., Modha D. S.: "A Data-Clustering Algorithm On Distributed Memory Multiprocessors", Int. Conf. on Knowledge Discovery and Data Mining (SIGKDD 99)


Distributed Clustering Using Collective Principal.. - Kargupta, Huang.. (1999)   (11 citations)  (Correct)

No context found.

I. Dhillon and D. Modha. A data clustering algorithm on distributed memory multiprocessors. In Workshop on Large-Scale Parallel KDD Systems, 1999.


Unsupervised Distributed Clustering - Tasoulis, Vrahatis (2004)   (Correct)

No context found.

I.S. Dhillon and D.S. Modha. A data-clustering algorithm on distributed memory multiprocessors. In Large-Scale Parallel Data Mining, Lecture Notes in Artificial Intelligence, pages 245--260, 2000.


Scalable Clustering - Ghosh (2003)   (Correct)

No context found.

Dhillon, I. S. and Modha, D. S. (1999). A data-clustering algorithm on distributed memory multiprocessors. In Proc. Large-scale Parallel KDD Systems Workshop, ACM SIGKDD.


CiteSeer.IST - Copyright Penn State and NEC