Results 1  10
of
762,126
The Data Grid: Towards an Architecture for the Distributed Management and Analysis of Large Scientific Datasets
 JOURNAL OF NETWORK AND COMPUTER APPLICATIONS
, 1999
"... In an increasing number of scientific disciplines, large data collections are emerging as important community resources. In this paper, we introduce design principles for a data management architecture called the Data Grid. We describe two basic services that we believe are fundamental to the des ..."
Abstract

Cited by 469 (42 self)
 Add to MetaCart
In an increasing number of scientific disciplines, large data collections are emerging as important community resources. In this paper, we introduce design principles for a data management architecture called the Data Grid. We describe two basic services that we believe are fundamental
FastMap: A Fast Algorithm for Indexing, DataMining and Visualization of Traditional and Multimedia Datasets
, 1995
"... A very promising idea for fast searching in traditional and multimedia databases is to map objects into points in kd space, using k featureextraction functions, provided by a domain expert [25]. Thus, we can subsequently use highly finetuned spatial access methods (SAMs), to answer several types ..."
Abstract

Cited by 497 (23 self)
 Add to MetaCart
A very promising idea for fast searching in traditional and multimedia databases is to map objects into points in kd space, using k featureextraction functions, provided by a domain expert [25]. Thus, we can subsequently use highly finetuned spatial access methods (SAMs), to answer several types of queries, including the `Query By Example' type (which translates to a range query); the `all pairs' query (which translates to a spatial join [8]); the nearestneighbor or bestmatch query, etc. However, designing feature extraction functions can be hard. It is relatively easier for a domain expert to assess the similarity/distance of two objects. Given only the distance information though, it is not obvious how to map objects into points. This is exactly the topic of this paper. We describe a fast algorithm to map objects into points in some kdimensional space (k is userdefined), such that the dissimilarities are preserved. There are two benefits from this mapping: (a) efficient ret...
Very simple classification rules perform well on most commonly used datasets
 Machine Learning
, 1993
"... The classification rules induced by machine learning systems are judged by two criteria: their classification accuracy on an independent test set (henceforth "accuracy"), and their complexity. The relationship between these two criteria is, of course, of keen interest to the machin ..."
Abstract

Cited by 542 (5 self)
 Add to MetaCart
to the machine learning community. There are in the literature some indications that very simple rules may achieve surprisingly high accuracy on many datasets. For example, Rendell occasionally remarks that many real world datasets have "few peaks (often just one) " and so are &
Community detection in graphs
, 2009
"... The modern science of networks has brought significant advances to our understanding of complex systems. One of the most relevant features of graphs representing real systems is community structure, or clustering, i. e. the organization of vertices in clusters, with many edges joining vertices of th ..."
Abstract

Cited by 801 (1 self)
 Add to MetaCart
The modern science of networks has brought significant advances to our understanding of complex systems. One of the most relevant features of graphs representing real systems is community structure, or clustering, i. e. the organization of vertices in clusters, with many edges joining vertices
Graphs over Time: Densification Laws, Shrinking Diameters and Possible Explanations
, 2005
"... How do real graphs evolve over time? What are “normal” growth patterns in social, technological, and information networks? Many studies have discovered patterns in static graphs, identifying properties in a single snapshot of a large network, or in a very small number of snapshots; these include hea ..."
Abstract

Cited by 534 (48 self)
 Add to MetaCart
How do real graphs evolve over time? What are “normal” growth patterns in social, technological, and information networks? Many studies have discovered patterns in static graphs, identifying properties in a single snapshot of a large network, or in a very small number of snapshots; these include
Estimating the number of clusters in a dataset via the Gap statistic
, 2000
"... We propose a method (the \Gap statistic") for estimating the number of clusters (groups) in a set of data. The technique uses the output of any clustering algorithm (e.g. kmeans or hierarchical), comparing the change in within cluster dispersion to that expected under an appropriate reference ..."
Abstract

Cited by 492 (1 self)
 Add to MetaCart
We propose a method (the \Gap statistic") for estimating the number of clusters (groups) in a set of data. The technique uses the output of any clustering algorithm (e.g. kmeans or hierarchical), comparing the change in within cluster dispersion to that expected under an appropriate reference null distribution. Some theory is developed for the proposal and a simulation study that shows that the Gap statistic usually outperforms other methods that have been proposed in the literature. We also briey explore application of the same technique to the problem for estimating the number of linear principal components. 1 Introduction Cluster analysis is an important tool for \unsupervised" learning the problem of nding groups in data without the help of a response variable. A major challenge in cluster analysis is estimation of the optimal number of \clusters". Figure 1 (top right) shows a typical plot of an error measure W k (the within cluster dispersion dened below) for a clustering pr...
Graphbased algorithms for Boolean function manipulation
 IEEE TRANSACTIONS ON COMPUTERS
, 1986
"... In this paper we present a new data structure for representing Boolean functions and an associated set of manipulation algorithms. Functions are represented by directed, acyclic graphs in a manner similar to the representations introduced by Lee [1] and Akers [2], but with further restrictions on th ..."
Abstract

Cited by 3499 (47 self)
 Add to MetaCart
to the sizes of the graphs being operated on, and hence are quite efficient as long as the graphs do not grow too large. We present experimental results from applying these algorithms to problems in logic design verification that demonstrate the practicality of our approach.
Large margin methods for structured and interdependent output variables
 JOURNAL OF MACHINE LEARNING RESEARCH
, 2005
"... Learning general functional dependencies between arbitrary input and output spaces is one of the key challenges in computational intelligence. While recent progress in machine learning has mainly focused on designing flexible and powerful input representations, this paper addresses the complementary ..."
Abstract

Cited by 612 (12 self)
 Add to MetaCart
that solves the optimization problem in polynomial time for a large class of problems. The proposed method has important applications in areas such as computational biology, natural language processing, information retrieval/extraction, and optical character recognition. Experiments from various domains
Large Margin Classification Using the Perceptron Algorithm
 Machine Learning
, 1998
"... We introduce and analyze a new algorithm for linear classification which combines Rosenblatt 's perceptron algorithm with Helmbold and Warmuth's leaveoneout method. Like Vapnik 's maximalmargin classifier, our algorithm takes advantage of data that are linearly separable with large ..."
Abstract

Cited by 518 (2 self)
 Add to MetaCart
with large margins. Compared to Vapnik's algorithm, however, ours is much simpler to implement, and much more efficient in terms of computation time. We also show that our algorithm can be efficiently used in very high dimensional spaces using kernel functions. We performed some experiments using our
Results 1  10
of
762,126