11 citations found.
S. Goil, et al. 1999. MAFIA: efficient and scalable subspace clustering for very large data sets. Northwestern University, Technical Report CPDC-TR-9906-010.

This paper is cited in the following contexts:
QROCK: A Quick Version of the ROCK Algorithm for.. - Dutta, Mahanta, Pujari

....fit in main memory, it is more relevant to investigate clustering algorithms meeting the specific requirement of minimizing the I/O operations. Some of the major clustering algorithms proposed in the context of data mining are BIRCH[14], CURE[10], PAM[12], CLARANS[12], DBSCAN[4], BUBBLE[7], MAFIA[8], ITERATE, CHAMELEON[11], etc. It is to be noted that the basic principle of clustering hinges on a concept of distance metric or similarity metric. Thus the clustering techniques that are designed mostly for numeric data, exploit the inherent geometric properties based on some priori structure ....

S. Goil, Harsha and Alok Choudhary. MAFIA: Efficient and scalable subspace clustering for very large data sets. Submitted for publication to ICDE, 2000.


Survey Of Clustering Data Mining Techniques - Berkhin (2002)   (18 citations)

....is considered good for clustering. Any subspace of a good subspace is also good, since = I Low entropy subspace corresponds to a skewed distribution of unit densities. The computational costs of ENCLUS are significant. The algorithm MAFIA (Merging of Adaptive Finite Intervals) by Goil et al. [GNC99, NGC01] significantly modifies CLIQUE. It starts with one data pass to construct adaptive grids in each dimension. Many (1000) bins are used to compute histograms by adding blocks of data in core, which are then merged together to come up with a smaller number of variable size bins than CLIQUE ....

Goil, S., Nagesh, H., and Choudhary, A. MAFIA: Efficient and scalable subspace clustering for very large data sets. Technical Report CPDC-TR-9906-010, Northwestern University, 1999.
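
The adaptive-grid step that the Berkhin excerpt above attributes to MAFIA (a fine histogram per dimension whose bins are then merged into fewer, variable-size bins) can be illustrated with a minimal Python sketch. The function name adaptive_bins, the tolerance parameter, and the merge rule (join a fine bin to the current run when its count stays within a relative tolerance of the run's mean count) are assumptions made for illustration only; the excerpt does not say how MAFIA decides which bins to merge, and this is not the authors' implementation.

```python
from typing import List, Sequence, Tuple


def adaptive_bins(
    values: Sequence[float],
    lo: float,
    hi: float,
    n_fine: int = 1000,
    tolerance: float = 0.2,
) -> List[Tuple[float, float, int]]:
    """Return (start, end, count) bins for one dimension.

    Assumes every value lies in [lo, hi] and hi > lo.
    The merge rule below is a hypothetical stand-in, not MAFIA's own rule.
    """
    width = (hi - lo) / n_fine

    # Pass 1: fine equal-width histogram over the dimension.
    counts = [0] * n_fine
    for v in values:
        idx = min(int((v - lo) / width), n_fine - 1)  # clamp v == hi
        counts[idx] += 1

    # Pass 2: merge adjacent fine bins whose counts are similar.
    merged: List[Tuple[float, float, int]] = []
    start, total, n = 0, counts[0], 1
    for i in range(1, n_fine):
        mean = total / n
        if abs(counts[i] - mean) <= tolerance * max(mean, 1.0):
            total += counts[i]
            n += 1
        else:
            merged.append((lo + start * width, lo + i * width, total))
            start, total, n = i, counts[i], 1
    merged.append((lo + start * width, hi, total))
    return merged


if __name__ == "__main__":
    # 20 points in each of the first five unit intervals, none above 5.
    data = [k + 0.5 for k in range(5) for _ in range(20)]
    for b in adaptive_bins(data, 0.0, 10.0, n_fine=10):
        print(b)
```

With data concentrated in one half of the range, the ten fine bins collapse into two variable-size bins, one dense and one empty, which is the kind of reduction the excerpt describes.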


Hypergraph Models and Algorithms for Data-Pattern Based.. - Ozdal, Aykanat (2001)   (1 citation)

....to obtain features that are relevant to all clusters. As a remedy for this, Agrawal et al. (1998) give a scalable algorithm CLIQUE to identify dense regions in subspaces of maximum dimensionality. There exist also variants of this algorithm such as PROCLUS (Aggarwal et al. 1999) and MAFIA (Nagesh et al. 1999). Although these algorithms can be used effectively for numerical attributes, the region definition may not be appropriate for categorical data, mainly because there exists no natural ordering between the values of such attributes. As an extension, Ganti et al. (1999) have defined interval region ....

Nagesh, H., Goil, S., and Choudhary, A.: `MAFIA: Efficient and Scalable Subspace Clustering for Very Large Data Sets', Technical Report 9906-010, Northwestern University, June 1999.


Mixtures of Rectangles: Interpretable Soft Clustering - Pelleg, Moore (2001)   (2 citations)

....membership (whether the datapoint is in a dense region or not). This precludes multiple class memberships. It also requires two user supplied parameters (the resolution of the grid and a density threshold) which are unlikely to be specified correctly for all but expert users and simple densities. Nagesh et al. (1999) try to fix this, but the hard membership assumption still holds in their work. Learning axis parallel boxes and their unions has been discussed in Maass and Warmuth (1995). Note, however, that our algorithm is unsupervised whereas the learning theory work is mainly concerned with supervised ....

Nagesh, H., Goil, S., & Choudhary, A. (1999). MAFIA: Efficient and scalable subspace clustering for very large data sets (Technical Report 9906-010). Northwestern University.


Clustering Algorithms for Spatial Databases: A Survey - Kolatch (2001)   (3 citations)

....In addition, each trigger is decomposed in sub triggers whose values are stored in individual cells. The triggers and sub triggers are evaluated incrementally to decrease the amount of time spent reprocessing data. 6.2 MAFIA: MAFIA (Merging of Adaptive Finite Intervals (And more than a CLIQUE)) [GHC99] is a modification of CLIQUE that runs faster and finds better quality clusters. The main change is the elimination of the pruning technique for limiting the number of subspaces examined, and the implementation of an adaptive interval size which partitions the dimension dependent on the data ....

Goil, Sanjay, Harsha Nagesh and Alok Choudhary. (1999). MAFIA: Efficient and Scalable Subspace Clustering for Very Large Data Sets. Technical Report Number CPDC-TR-9906-019, Center for Parallel and Distributed Computing, Northwestern University.


Detecting Significant Multidimensional Spatial Clusters - Daniel Neill

No context found.

S. Goil, et al. 1999. MAFIA: efficient and scalable subspace clustering for very large data sets. Northwestern University, Technical Report CPDC-TR-9906-010.


Rapid Detection of Significant Spatial Clusters - Neill, Moore (2004)

No context found.

S. Goil, H. Nagesh, and A. Choudhary. MAFIA: efficient and scalable subspace clustering for very large data sets. Technical Report CPDC-TR-9906-010, Northwestern University, 1999.


A Fast Multi-Resolution Method for Detection of Significant.. - Neill, Moore (2003)   (2 citations)

No context found.

S. Goil, et al. 1999. MAFIA: efficient and scalable subspace clustering for very large data sets. Northwestern University, Technical Report No. CPDC-TR-9906-010.


A Fast Multi-Resolution Method for Detection of.. - Daniel Neill Carnegie (2003)   (2 citations)

No context found.

S. Goil, et al. 1999. MAFIA: efficient and scalable subspace clustering for very large data sets. Northwestern University, Technical Report No. CPDC-TR-9906-010.


Ad hoc Query Support for Very Large Simulation Mesh.. - Lee, Snapp, Musick.. (2001)

No context found.

S. Goil, Harsha, and A. Choudhary, MAFIA: Efficient and Scalable Subspace Clustering for Very Large Datasets, Technical Report, Northwestern University, 1999.


A Requirements Analysis for Parallel KDD Systems - Maniatty, Zaki (2000)   (7 citations)

No context found.

H. Nagesh, S. Goil, and A. Choudhary. MAFIA: Efficient and scalable subspace clustering for very large data sets. Technical Report 9906-010, Northwestern University, June 1999.
