Results 1 - 10 of 23,510
MapReduce: Simplified data processing on large clusters.
In Proceedings of the Sixth Symposium on Operating System Design and Implementation (OSDI '04), 2004
"... MapReduce is a programming model and an associated implementation for processing and generating large data sets. Programs written in this functional style are automatically parallelized and executed on a large cluster of commodity machines. The runtime system takes care of the details of ..."
Cited by 3439 (3 self)
Clustering processes
"... The problem of clustering is considered, for the case when each data point is a sample generated by a stationary ergodic process. We propose a very natural asymptotic notion of consistency, and show that simple consistent algorithms exist, under most general nonparametric assumptions. The notion of ..."
Cited by 17 (14 self)
Mean shift, mode seeking, and clustering
IEEE Transactions on Pattern Analysis and Machine Intelligence, 1995
"... Mean shift, a simple iterative procedure that shifts each data point to the average of data points in its neighborhood, is generalized and analyzed in this paper. This generalization makes some k-means-like clustering algorithms its special cases. It is shown that mean shift is a mode-seeking proce ..."
Cited by 624 (0 self)
seeking process on a surface constructed with a “shadow” kernel. For Gaussian kernels, mean shift is a gradient mapping. Convergence is studied for mean shift iterations. Cluster analysis is treated as a deterministic problem of finding a fixed point of mean shift that characterizes the data. Applications
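The iterative procedure described in this abstract can be illustrated with a minimal sketch. It is not the paper's generalized formulation: it assumes one-dimensional data, a flat kernel (every neighbor weighted equally), and a fixed sample set that is not itself moved. The function name `mean_shift_1d` and the parameters `bandwidth`, `iterations`, and `tol` are hypothetical, chosen only for illustration.

```python
def mean_shift_1d(points, bandwidth, iterations=50, tol=1e-9):
    """Move each point toward the mean of the samples within `bandwidth` of it.

    Assumptions (not from the source): 1-D data, flat kernel, fixed samples.
    Returns the final position reached by each starting point.
    """
    shifted = list(points)
    for _ in range(iterations):
        largest_move = 0.0
        for i, x in enumerate(shifted):
            # Neighborhood: all original samples within `bandwidth` of x.
            neighbors = [p for p in points if abs(p - x) <= bandwidth]
            if not neighbors:
                continue  # defensive guard; leave the point where it is
            new_x = sum(neighbors) / len(neighbors)
            largest_move = max(largest_move, abs(new_x - x))
            shifted[i] = new_x
        # Stop once no point moved by more than the tolerance.
        if largest_move < tol:
            break
    return shifted


if __name__ == "__main__":
    data = [1.0, 1.2, 0.8, 5.0, 5.1, 4.9]
    print(mean_shift_1d(data, bandwidth=1.0))
```

With the sample data above, the points settle into two groups, one near 1.0 and one near 5.0; points that end at the same location can be read as belonging to the same cluster.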
Cluster analysis and display of genome-wide expression patterns
Proc. Natl. Acad., 1998
"... A system of cluster analysis for genome-wide expression data from DNA microarray hybridization is described that uses standard statistical algorithms to arrange genes according to similarity in pattern of gene expression. The output is displayed graphically, conveying the clustering and th ..."
Cited by 2895 (44 self)
Clustering by passing messages between data points
Science, 2007
"... Clustering data by identifying a subset of representative examples is important for processing sensory signals and detecting patterns in data. Such “exemplars” can be found by randomly choosing an initial subset of data points and then iteratively refining it, but this works well only if that initi ..."
Cited by 696 (8 self)
OPTICS: Ordering Points To Identify the Clustering Structure
1999
"... Cluster analysis is a primary method for database mining. It is either used as a standalone tool to get insight into the distribution of a data set, e.g. to focus further analysis and data processing, or as a preprocessing step for other algorithms operating on the detected clusters. Almost all of ..."
Cited by 527 (51 self)
Hierarchical Dirichlet processes.
Journal of the American Statistical Association, 2006
"... We consider problems involving groups of data where each observation within a group is a draw from a mixture model and where it is desirable to share mixture components between groups. We assume that the number of mixture components is unknown a priori and is to be inferred from the data. In this s ..."
Cited by 942 (78 self)
In this setting it is natural to consider sets of Dirichlet processes, one for each group, where the well-known clustering property of the Dirichlet process provides a nonparametric prior for the number of mixture components within each group. Given our desire to tie the mixture models in the various groups, we
Unsupervised texture segmentation using Gabor filters
Pattern Recognition
"... We present a texture segmentation algorithm inspired by the multichannel filtering theory for visual information processing in the early stages of human visual system. The channels are characterized by a bank of Gabor filters that nearly uniformly covers the spatial-frequency domain. We propose a s ..."
Cited by 616 (20 self)
error clustering algorithm is then used to integrate the feature images and produce a segmentation. A simple procedure to incorporate spatial adjacency information in the clustering process is also proposed. We report experiments on images with natural textures as well as artificial textures with identical 2nd
Pregel: A system for large-scale graph processing
In SIGMOD, 2010
"... Many practical computing problems concern large graphs. Standard examples include the Web graph and various social networks. The scale of these graphs—in some cases billions of vertices, trillions of edges—poses challenges to their efficient processing. In this paper we present a computational model ..."
Cited by 496 (0 self)
K-SVD: An Algorithm for Designing Overcomplete Dictionaries for Sparse Representation
2006
"... In recent years there has been a growing interest in the study of sparse representation of signals. Using an overcomplete dictionary that contains prototype signal-atoms, signals are described by sparse linear combinations of these atoms. Applications that use sparse representation are many and inc ..."
Cited by 935 (41 self)
signal representations. Given a set of training signals, we seek the dictionary that leads to the best representation for each member in this set, under strict sparsity constraints. We present a new method, the K-SVD algorithm, generalizing the K-means clustering process. K-SVD is an iterative method