New Algorithms for Learning Incoherent and Overcomplete Dictionaries
, 2014
Cited by 19 (2 self)
In sparse recovery we are given a matrix A ∈ R^{n×m} (“the dictionary”) and a vector of the form AX where X is sparse, and the goal is to recover X. This is a central notion in signal processing, statistics and machine learning. But in applications such as sparse coding, edge detection, compression and super resolution, the dictionary A is unknown and has to be learned from random examples of the form Y = AX where X is drawn from an appropriate distribution; this is the dictionary learning problem. In most settings, A is overcomplete: it has more columns than rows. This paper presents a polynomial-time algorithm for learning overcomplete dictionaries; the only previously known algorithm with provable guarantees is the recent work of Spielman et al. (2012), who gave an algorithm for the undercomplete case, which is rarely the case in applications. Our algorithm applies to incoherent dictionaries, which have been a central object of study since they were introduced in the seminal work of Donoho and Huo (1999). In particular, a dictionary is µ-incoherent if each pair of columns has inner product at most µ/√n. The algorithm makes natural stochastic assumptions about the unknown sparse vector X, which can contain k ≤ c min(√n/(µ log n), m^{1/2−η}) non-zero entries (for any η > 0). This is close to the best k allowable by the best sparse recovery algorithms even if one knows the dictionary A exactly. Moreover, both the running time and sample complexity depend on log 1/ε, where ε is the target accuracy, and so our algorithms converge very quickly to the true dictionary. Our algorithm can also tolerate substantial amounts of noise provided it is incoherent with respect to the dictionary (e.g., Gaussian). In the noisy setting, our running time and sample complexity depend polynomially on 1/ε, and this is necessary.
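The incoherence condition in the abstract above can be checked numerically. The following is a minimal sketch, not code from the paper: the function name `incoherence`, the row-list representation of A, and the choice to normalize columns to unit length before comparing them are all assumptions made for illustration. It returns the smallest µ for which every pair of distinct columns has inner product at most µ/√n in absolute value.

```python
import math

def incoherence(A):
    """Return mu = sqrt(n) * max |<a_i, a_j>| over distinct columns of A.

    A is a list of n rows, each with m entries (hypothetical representation).
    Columns are rescaled to unit length first, which is an assumption here.
    """
    n, m = len(A), len(A[0])
    unit_cols = []
    for c in range(m):
        col = [A[r][c] for r in range(n)]
        norm = math.sqrt(sum(x * x for x in col))
        unit_cols.append([x / norm for x in col])
    # Largest absolute inner product between two different columns.
    worst = max(
        abs(sum(x * y for x, y in zip(unit_cols[i], unit_cols[j])))
        for i in range(m)
        for j in range(i + 1, m)
    )
    return math.sqrt(n) * worst

# A small overcomplete dictionary: 2 rows, 3 columns (e1, e2, and their sum).
A = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
mu = incoherence(A)
```

For this toy dictionary the worst pair is a basis vector against the normalized diagonal column, so µ comes out close to 1; an orthonormal dictionary would give µ = 0.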
Modeling and Detecting Community Hierarchies
Cited by 4 (3 self)
Abstract. Community detection has in recent years emerged as an invaluable tool for describing and quantifying interactions in networks. In this paper we propose a theoretical model that explicitly formalizes both the tight connections within each community and the hierarchical nature of the communities. We further present an efficient algorithm that provably detects all the communities in our model. Experiments demonstrate that our definition successfully models real world communities, and our algorithm compares favorably with existing approaches.
Provable Algorithms for Machine Learning Problems
, 2013
Cited by 2 (0 self)
Modern machine learning algorithms can extract useful information from text, images and videos. All these applications involve solving NP-hard problems in the average case using heuristics. What properties of the input allow it to be solved efficiently? Theoretically analyzing the heuristics is often very challenging, and few results were known. This thesis takes a different approach: we identify natural properties of the input, then design new algorithms that provably work assuming the input has these properties. We are able to give new, provable and sometimes practical algorithms for learning tasks related to text corpora, images and social networks. The first part of the thesis presents new algorithms for learning thematic structure in documents. We show that, under a reasonable assumption, it is possible to provably learn many topic models, including the famous Latent Dirichlet Allocation. Our algorithm is the first provable algorithm for topic modeling. An implementation runs 50 times faster than the latest MCMC implementation and produces comparable results. The second part of the thesis provides ideas for provably learning deep, sparse representations. We start with sparse linear representations, and give the first algorithm for the dictionary learning problem with provable guarantees. Then we apply similar ideas to deep learning: under reasonable assumptions our algorithms can learn a deep network built by denoising autoencoders. The final part of the thesis develops a framework for learning latent variable models. We demonstrate how various latent variable models can be reduced to orthogonal tensor decomposition, and then be solved using the tensor power method. We give a tight perturbation analysis for the tensor power method, which reduces the number of samples required to learn many latent variable models.
In theory, the assumptions in this thesis help us understand why intractable problems in machine learning can often be solved; in practice, the results suggest inherently new approaches for machine learning. We hope the assumptions and algorithms inspire new research problems and learning algorithms.
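The tensor power method mentioned in the abstract above can be illustrated on a toy input. This is a minimal sketch under stated assumptions, not the thesis's implementation: the helper names `build_tensor`, `tensor_apply` and `power_iteration` are hypothetical, the tensor is an exactly orthogonally decomposable symmetric 3-way tensor T = Σ λ_i v_i⊗v_i⊗v_i with no noise, and a single fixed starting vector is used where a practical version would use random restarts and deflation.

```python
import math

def build_tensor(weights, vectors):
    """Build T = sum_i w_i * v_i (x) v_i (x) v_i as nested lists."""
    n = len(vectors[0])
    T = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for w, v in zip(weights, vectors):
        for i in range(n):
            for j in range(n):
                for k in range(n):
                    T[i][j][k] += w * v[i] * v[j] * v[k]
    return T

def tensor_apply(T, u):
    """Return the vector T(I, u, u): contract the last two modes with u."""
    n = len(u)
    return [
        sum(T[i][j][k] * u[j] * u[k] for j in range(n) for k in range(n))
        for i in range(n)
    ]

def power_iteration(T, u0, iters=50):
    """Repeat u <- T(I, u, u) / ||T(I, u, u)||; return (eigenvalue, vector)."""
    norm = math.sqrt(sum(x * x for x in u0))
    u = [x / norm for x in u0]
    for _ in range(iters):
        w = tensor_apply(T, u)
        norm = math.sqrt(sum(x * x for x in w))
        u = [x / norm for x in w]
    # Eigenvalue estimate T(u, u, u) at the final iterate.
    lam = sum(ui * wi for ui, wi in zip(u, tensor_apply(T, u)))
    return lam, u

# Toy tensor with orthonormal components e1, e2, e3 and weights 3, 2, 1.
basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
T = build_tensor([3.0, 2.0, 1.0], basis)
lam, u = power_iteration(T, [0.8, 0.5, 0.3])
```

With this starting point the iteration is drawn to the first component, because the product of weight and initial overlap is largest there; a different start can converge to a different component, which is why practical variants restart from several random vectors.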
An Axiomatic Approach to Community Detection
Cited by 1 (0 self)
Abstract. Inspired by social choice theory in voting and other contexts [2], we provide the first axiomatic approach to community identification in social and information networks. We start from an abstract framework, called preference networks. Within this framework, we axiomatically study the formation and structures of communities in two different ways. First, we apply social choice theory and define communities indirectly by postulating that they are fixed points of a preference aggregation function obeying certain desirable axioms. Second, we directly postulate six desirable axioms for communities to satisfy, without reference to preference aggregation. For the second approach, we prove a taxonomy theorem that provides a structural characterization of the family of axiom-conforming community rules as a lattice. We complement this structural theorem with a complexity result, showing that, while community characterization is straightforward for some rules in the lattice, it is coNP-complete to characterize subsets according to others. Our study also sheds light on the limitations of defining community rules solely based on preference aggregation, namely that many aggregation functions lead to communities which violate at least one of our community axioms. These include any aggregation function satisfying Arrow's "independence of irrelevant alternatives" axiom, as well as commonly used aggregation schemes like the Borda count or generalizations thereof. Finally, we give a polynomial-time rule consistent with five axioms and weakly satisfying the sixth axiom.
Uncovering the Small Community Structure in Large Networks: A Local Spectral Approach
Cited by 1 (1 self)
Large graphs arise in a number of contexts, and understanding their structure and extracting information from them is an important research area. Early algorithms for mining communities focused on the global structure, and often run in time that depends on the size of the entire graph. Nowadays, as we often explore networks with billions of vertices and find communities whose size is in the hundreds, it is crucial to shift our attention from macroscopic structure to microscopic structure when dealing with large networks. A growing body of work has been adopting local expansion methods in order to identify the community from a few exemplary seed members. In this paper, we propose a novel approach for finding overlapping communities called LEMON (Local Expansion
Detecting and Characterizing Small Dense Bipartite-like Subgraphs by the Bipartiteness Ratio Measure
Overlap Graph Clustering via Successive Removal
Abstract. One of the fundamental questions in the study of complex networks is community detection, i.e., given a graph that represents interactions in a real system, can we group vertices with similar interests together? In many applications, we are often in a setting where vertices may potentially belong to multiple communities. In this paper, we propose an efficient algorithm for overlapping community detection which can successively recover all the communities. We provide theoretical guarantees on the performance of the algorithm by leveraging convex relaxation and exploiting the fact that in many networks there are often vertices that only belong to one community.
Provably Fast Inference of Latent Features from Networks with Applications to Learning Social Circles and Multilabel Classification
A well-known phenomenon in social networks is homophily, the tendency of agents to connect with similar agents. A derivative of this phenomenon is the emergence of communities. Another phenomenon observed in numerous networks is the existence of certain agents that belong simultaneously to multiple communities. An understanding of these phenomena constitutes a central research topic of network science. In this work we focus on a fundamental theoretical question related to the above phenomena with various applications: given an undirected graph G, can we efficiently infer the latent vertex features which explain the observed network structure under the assumption of a generative model that exhibits homophily? We propose a probabilistic generative model with the property that the probability of an edge between two vertices is a non-decreasing function of the common features they possess. This property is true for many real-world networks and, surprisingly, is ignored by many popular overlapping community detection methods, as was shown recently by the empirical work of Yang and Leskovec [44]. Our main theoretical contribution is the first provably rapidly mixing Markov chain for inferring latent features. On the experimental side, we verify the efficiency of our method in terms of run times, where we observe that it significantly outperforms state-of-the-art methods. Our method is more than 2,400 times faster than a state-of-the-art machine learning method [37] and typically provides non-trivial speedups compared to BigClam [43]. Furthermore, we verify on real data with available ground truth that our method efficiently learns high-quality labelings. We use our method to learn social circles from Twitter ego-networks and perform multilabel classification.
Fixed-Points of Social Choice: An Axiomatic Approach to Network Communities
, 2014