Results 1  10
of
990
ROCK: A Robust Clustering Algorithm for Categorical Attributes
 In Proc.ofthe15thInt.Conf.onDataEngineering
, 2000
"... Clustering, in data mining, is useful to discover distribution patterns in the underlying data. Clustering algorithms usually employ a distance metric based (e.g., euclidean) similarity measure in order to partition the database such that data points in the same partition are more similar than point ..."
Abstract

Cited by 446 (2 self)
 Add to MetaCart
Clustering, in data mining, is useful to discover distribution patterns in the underlying data. Clustering algorithms usually employ a distance metric based (e.g., euclidean) similarity measure in order to partition the database such that data points in the same partition are more similar than points in different partitions. In this paper, we study clustering algorithms for data with boolean and categorical attributes. We show that traditional clustering algorithms that use distances between points for clustering are not appropriate for boolean and categorical attributes. Instead, we propose a novel concept of links to measure the similarity/proximity between a pair of data points. We develop a robust hierarchical clustering algorithm ROCK that employs links and not distances when merging clusters.
SelfTesting/Correcting with Applications to Numerical Problems
, 1990
"... Suppose someone gives us an extremely fast program P that we can call as a black box to compute a function f . Should we trust that P works correctly? A selftesting/correcting pair allows us to: (1) estimate the probability that P (x) 6= f(x) when x is randomly chosen; (2) on any input x, compute ..."
Abstract

Cited by 361 (27 self)
 Add to MetaCart
Suppose someone gives us an extremely fast program P that we can call as a black box to compute a function f . Should we trust that P works correctly? A selftesting/correcting pair allows us to: (1) estimate the probability that P (x) 6= f(x) when x is randomly chosen; (2) on any input x, compute f(x) correctly as long as P is not too faulty on average. Furthermore, both (1) and (2) take time only slightly more than Computer Science Division, U.C. Berkeley, Berkeley, California 94720, Supported by NSF Grant No. CCR 8813632. y International Computer Science Institute, Berkeley, California 94704 z Computer Science Division, U.C. Berkeley, Berkeley, California 94720, Supported by an IBM Graduate Fellowship and NSF Grant No. CCR 8813632. the original running time of P . We present general techniques for constructing simple to program selftesting /correcting pairs for a variety of numerical problems, including integer multiplication, modular multiplication, matrix multiplicatio...
Polynomial time algorithms for multicast network code construction
 IEEE TRANS. ON INFO. THY
, 2005
"... The famous maxflow mincut theorem states that a source node can send information through a network ( ) to a sink node at a rate determined by the mincut separating and. Recently, it has been shown that this rate can also be achieved for multicasting to several sinks provided that the intermediat ..."
Abstract

Cited by 316 (29 self)
 Add to MetaCart
(Show Context)
The famous maxflow mincut theorem states that a source node can send information through a network ( ) to a sink node at a rate determined by the mincut separating and. Recently, it has been shown that this rate can also be achieved for multicasting to several sinks provided that the intermediate nodes are allowed to reencode the information they receive. We demonstrate examples of networks where the achievable rates obtained by coding at intermediate nodes are arbitrarily larger than if coding is not allowed. We give deterministic polynomial time algorithms and even faster randomized algorithms for designing linear codes for directed acyclic graphs with edges of unit capacity. We extend these algorithms to integer capacities and to codes that are tolerant to edge failures.
Algorithms for Sequential Decision Making
, 1996
"... Sequential decision making is a fundamental task faced by any intelligent agent in an extended interaction with its environment; it is the act of answering the question "What should I do now?" In this thesis, I show how to answer this question when "now" is one of a finite set of ..."
Abstract

Cited by 213 (8 self)
 Add to MetaCart
(Show Context)
Sequential decision making is a fundamental task faced by any intelligent agent in an extended interaction with its environment; it is the act of answering the question "What should I do now?" In this thesis, I show how to answer this question when "now" is one of a finite set of states, "do" is one of a finite set of actions, "should" is maximize a longrun measure of reward, and "I" is an automated planning or learning system (agent). In particular,
Efficient influence maximization in social networks
 In Proc. of ACM KDD
, 2009
"... Influence maximization is the problem of finding a small subset of nodes (seed nodes) in a social network that could maximize the spread of influence. In this paper, we study the efficient influence maximization from two complementary directions. One is to improve the original greedy algorithm of [5 ..."
Abstract

Cited by 197 (18 self)
 Add to MetaCart
(Show Context)
Influence maximization is the problem of finding a small subset of nodes (seed nodes) in a social network that could maximize the spread of influence. In this paper, we study the efficient influence maximization from two complementary directions. One is to improve the original greedy algorithm of [5] and its improvement [7] to further reduce its running time, and the second is to propose new degree discount heuristics that improves influence spread. We evaluate our algorithms by experiments on two large academic collaboration graphs obtained from the online archival database arXiv.org. Our experimental results show that (a) our improved greedy algorithm achieves better running time comparing with the improvement of [7] with matching influence spread, (b) our degree discount heuristics achieve much better influence spread than classic degree and centralitybased heuristics, and when tuned for a specific influence cascade model, it achieves almost matching influence thread with the greedy algorithm, and more importantly (c) the degree discount heuristics run only in milliseconds while even the improved greedy algorithms run in hours in our experiment graphs with a few tens of thousands of nodes. Based on our results, we believe that finetuned heuristics may provide truly scalable solutions to the influence maximization problem with satisfying influence spread and blazingly fast running time. Therefore, contrary to what implied by the conclusion of [5] that traditional heuristics are outperformed by the greedy approximation algorithm, our results shed new lights on the research of heuristic algorithms.
Improved approximation algorithms for large matrices via random projections.
 In Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science
, 2006
"... ..."
(Show Context)
On the complexity of solving Markov decision problems
 IN PROC. OF THE ELEVENTH INTERNATIONAL CONFERENCE ON UNCERTAINTY IN ARTIFICIAL INTELLIGENCE
, 1995
"... Markov decision problems (MDPs) provide the foundations for a number of problems of interest to AI researchers studying automated planning and reinforcement learning. In this paper, we summarize results regarding the complexity of solving MDPs and the running time of MDP solution algorithms. We argu ..."
Abstract

Cited by 162 (11 self)
 Add to MetaCart
Markov decision problems (MDPs) provide the foundations for a number of problems of interest to AI researchers studying automated planning and reinforcement learning. In this paper, we summarize results regarding the complexity of solving MDPs and the running time of MDP solution algorithms. We argue that, although MDPs can be solved efficiently in theory, more study is needed to reveal practical algorithms for solving large problems quickly. To encourage future research, we sketch some alternative methods of analysis that rely on the structure of MDPs.
Multiplying matrices faster than coppersmithwinograd
 In Proc. 44th ACM Symposium on Theory of Computation
, 2012
"... We develop new tools for analyzing matrix multiplication constructions similar to the CoppersmithWinograd construction, and obtain a new improved bound on ω < 2.3727. 1 ..."
Abstract

Cited by 150 (9 self)
 Add to MetaCart
(Show Context)
We develop new tools for analyzing matrix multiplication constructions similar to the CoppersmithWinograd construction, and obtain a new improved bound on ω < 2.3727. 1
Hidden Field Equations (HFE) and Isomorphisms of Polynomials (IP): two new Families of Asymmetric Algorithms
, 1996
"... In [11] T. Matsumoto and H. Imai described a new asymmetric algorithm based on multivariate polynomials of degree twoo ver a finite field. Then in [14] this algorithm was broken. The aim of this paper is to show that despite this result it is probably possible to use multivariate polynomials of degr ..."
Abstract

Cited by 150 (9 self)
 Add to MetaCart
In [11] T. Matsumoto and H. Imai described a new asymmetric algorithm based on multivariate polynomials of degree twoo ver a finite field. Then in [14] this algorithm was broken. The aim of this paper is to show that despite this result it is probably possible to use multivariate polynomials of degree two in carefully designed algorithms for asymmetric cryptography. In this paper we will give some examples of suchschemes. All the examples that we will give, belong to two large family of schemes: HFE and IP. With HFE we will be able to do encryption, signatures or authentication in an asymmetric way. Moreover HFE (with properly chosen parameters) resist to all known attacks and can be used in order to givevery short asymmetric signatures or very short encrypted messages (of length 128 bits or 64 bits for example). IP can be used for asymmetric authentications or signatures. IP authentications are zero knowledge.
Generating Random Spanning Trees More Quickly than the Cover Time
 PROCEEDINGS OF THE TWENTYEIGHTH ANNUAL ACM SYMPOSIUM ON THE THEORY OF COMPUTING
, 1996
"... ..."