Decoding binary node labels from censored edge measurements: Phase transition and efficient recovery
, 2014
Abstract
Cited by 12 (6 self)
We consider the problem of clustering a graph G into two communities by observing a subset of the vertex correlations. Specifically, we consider the inverse problem with observed variables Y = B_G x ⊕ Z, where B_G is the incidence matrix of a graph G, x is the vector of unknown vertex variables (with a uniform prior) and Z is a noise vector with Bernoulli(ε) i.i.d. entries. All variables and operations are Boolean. This model is motivated by coding, synchronization, and community detection problems. In particular, it corresponds to a stochastic block model or a correlation clustering problem with two communities and censored edges. Without noise, exact recovery (up to global flip) of x is possible if and only if the graph G is connected, with a sharp threshold at the edge probability log(n)/n for Erdős–Rényi random graphs. The first goal of this paper is to determine how the edge probability p needs to scale to allow exact recovery in the presence of noise. Defining the degree (oversampling) rate of the graph by α = np/log(n), it is shown that exact recovery is possible if and only if α > 2/(1 − 2ε)² + o(1/(1 − 2ε)²). In other words, 2/(1 − 2ε)² is the information-theoretic threshold for exact recovery at low SNR. In addition, an efficient recovery algorithm based on semidefinite programming is proposed and shown to succeed in the threshold regime up to twice the optimal rate. For a deterministic graph G, defining the degree rate as α = d/log(n), where d is the minimum degree of the graph, it is shown that the proposed method achieves the rate α > 4((1 + λ)/(1 − λ)²)/(1 − 2ε)² + o(1/(1 − 2ε)²), where 1 − λ is the spectral gap of the graph G. A preliminary version of this paper appeared in ISIT 2014 [ABBS14].
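The observation model Y = B_G x ⊕ Z is concrete enough to simulate directly. The sketch below samples it on an Erdős–Rényi graph and, in the noiseless case (ε = 0), recovers x up to a global flip by propagating edge parities along a BFS spanning tree, which succeeds exactly when G is connected. This is only an illustrative toy, not the paper's semidefinite-programming algorithm; the function names are hypothetical.

```python
import random
from collections import deque

def censored_edge_observations(n, p, eps, seed=0):
    """Sample Y = B_G x XOR Z on an Erdos-Renyi graph G(n, p).

    Returns the hidden labels x and a dict mapping each present edge
    (u, v) to its noisy parity observation y_uv = x_u ^ x_v ^ z_uv.
    """
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    obs = {}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:                    # edge present
                z = 1 if rng.random() < eps else 0  # Bernoulli(eps) noise
                obs[(u, v)] = x[u] ^ x[v] ^ z
    return x, obs

def recover_noiseless(n, obs):
    """Exact recovery (up to global flip) when eps = 0: propagate parities
    along a BFS spanning tree; works on every connected component reachable
    from vertex 0, so it succeeds iff G is connected."""
    adj = {u: [] for u in range(n)}
    for (u, v), y in obs.items():
        adj[u].append((v, y))
        adj[v].append((u, y))
    xhat = [None] * n
    xhat[0] = 0                        # fix the global flip arbitrarily
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v, y in adj[u]:
            if xhat[v] is None:
                xhat[v] = xhat[u] ^ y  # y = x_u ^ x_v forces x_v
                queue.append(v)
    return xhat                        # None entries = other components

# With p = 0.3 >> log(50)/50, the graph is connected with high probability.
x, obs = censored_edge_observations(n=50, p=0.3, eps=0.0)
xhat = recover_noiseless(50, obs)
flip = x[0] ^ xhat[0]
assert all(h is not None and (h ^ flip) == xi for h, xi in zip(xhat, x))
```

With ε > 0 the spanning-tree propagation breaks down (a single flipped edge corrupts an entire subtree), which is why the noisy regime requires the oversampling rate α and a global method such as the SDP studied in the paper.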
Linear inverse problems on Erdős–Rényi graphs: Information-theoretic limits and efficient recovery
Abstract
Cited by 7 (5 self)
Abstract—This paper considers the inverse problem with observed variables Y = B_G X ⊕ Z, where B_G is the incidence matrix of a graph G, X is the vector of unknown vertex variables with a uniform prior, and Z is a noise vector with Bernoulli(ε) i.i.d. entries. All variables and operations are Boolean. This model is motivated by coding, synchronization, and community detection problems. In particular, it corresponds to a stochastic block model or a correlation clustering problem with two communities and censored edges. Without noise, exact recovery of X is possible if and only if the graph G is connected, with a sharp threshold at the edge probability log(n)/n for Erdős–Rényi random graphs. The first goal of this paper is to determine how the edge probability p needs to scale to allow exact recovery in the presence of noise. Defining the degree (oversampling) rate of the graph by α = np/log(n), it is shown that exact recovery is possible if and only if α > 2/(1 − 2ε)² + o(1/(1 − 2ε)²). In other words, 2/(1 − 2ε)² is the information-theoretic threshold for exact recovery at low SNR. In addition, an efficient recovery algorithm based on semidefinite programming is proposed and shown to succeed in the threshold regime up to twice the optimal rate. Full version available in [1].
Asymptotic Mutual Information for the Two-Groups Stochastic Block Model
, 2015
Abstract
We develop an information-theoretic view of the stochastic block model, a popular statistical model for the large-scale structure of complex networks. A graph G from such a model is generated by first assigning vertex labels at random from a finite alphabet, and then connecting vertices with edge probabilities depending on the labels of the endpoints. In the case of the symmetric two-group model, we establish an explicit 'single-letter' characterization of the per-vertex mutual information between the vertex labels and the graph. The explicit expression of the mutual information is intimately related to estimation-theoretic quantities, and, in particular, reveals a phase transition at the critical point for community detection. Below the critical point the per-vertex mutual information is asymptotically the same as if edges were independent. Correspondingly, no algorithm can estimate the partition better than random guessing. Conversely, above the threshold, the per-vertex mutual information is strictly smaller than the independent-edges upper bound. In this regime there exists a procedure that estimates the vertex labels better than random guessing.
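The generative model described above is easy to sample. The sketch below draws a symmetric two-group stochastic block model: uniform random labels, then an edge with probability p_in for same-label pairs and p_out for cross-label pairs. The function name is hypothetical, and the code only shows sampling, not the mutual-information analysis; in the sparse regime the paper studies, one would take p_in = a/n and p_out = b/n.

```python
import random

def sample_sbm(n, p_in, p_out, seed=0):
    """Symmetric two-group stochastic block model: assign each vertex a
    label in {0, 1} uniformly at random, then connect each pair with
    probability p_in (same label) or p_out (different labels)."""
    rng = random.Random(seed)
    labels = [rng.randint(0, 1) for _ in range(n)]
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            p = p_in if labels[u] == labels[v] else p_out
            if rng.random() < p:
                edges.append((u, v))
    return labels, edges

# Assortative example: in-group edges ~5x more likely than cross-group.
labels, edges = sample_sbm(n=200, p_in=0.1, p_out=0.02)
```

Whether the partition is detectable from a single such graph depends on how far (p_in, p_out) sit above the critical point discussed in the abstract, not merely on p_in > p_out.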
ISIT 2015 Tutorial: Information Theory and Machine Learning
Abstract
Abstract We are in the midst of a data deluge, with an explosion in the volume and richness of data sets in fields including social networks, biology, natural language processing, and computer vision, among others. In all of these areas, machine learning has been extraordinarily successful in providing tools and practical algorithms for extracting information from massive data sets (e.g., genetics, multispectral imaging, Google and Facebook). Despite this tremendous practical success, relatively less attention has been paid to fundamental limits and tradeoffs, and information theory has a crucial role to play in this context. The goal of this tutorial is to demonstrate how information-theoretic techniques and concepts can be brought to bear on machine learning problems in unorthodox and fruitful ways. We discuss how any learning problem can be formalized in a Shannon-theoretic sense, albeit one that involves non-traditional notions of codewords and channels. This perspective allows information-theoretic tools, including information measures, Fano's inequality, random coding arguments, and so on, to be brought to bear on learning problems. We illustrate this broad perspective with discussions of several learning problems, including sparse approximation, dimensionality reduction, graph recovery, clustering, and community detection. We emphasise recent results establishing the fundamental limits of graphical model learning and community detection. We also discuss the distinction between the learning-theoretic capacity when arbitrary "decoding" algorithms are allowed, and notions of computationally constrained capacity. Finally, a number of open problems and conjectures at the interface of information theory and machine learning will be discussed.
Partial Functional Correspondence
Abstract
Figure 1: Partial functional correspondence between two pairs of shapes with large missing parts. For each pair we show the matrix C representing the functional map in the spectral domain, and the action of the map by transferring colors from one shape to the other. The special slanted-diagonal structure of C induced by the partiality transformation is first estimated from spectral properties of the two shapes, and then exploited to drive the matching process. In this paper, we propose a method for computing partial functional correspondence between non-rigid shapes. We use perturbation analysis to show how removal of shape parts changes the Laplace–Beltrami eigenfunctions, and exploit it as a prior on the spectral representation of the correspondence. Corresponding parts are optimization variables in our problem and are used to weight the functional correspondence; we are looking for the largest and most regular (in the Mumford–Shah sense) parts that minimize correspondence distortion. We show that our approach can cope with very challenging correspondence settings.
Consistent Partial Matching of Shape Collections via Sparse Modeling
Abstract
Figure 1: A partial multi-way correspondence obtained with our approach on a heterogeneous collection of shapes. Our method does not require initial pairwise maps as input, as it actively seeks a reliable correspondence by operating directly over the space of joint, cycle-consistent matches. Partially-similar as well as outlier shapes are automatically detected and accounted for by adopting a sparse model for the joint correspondence. A subset of all matches is shown for visualization purposes. Recent efforts in the area of joint object matching approach the problem by taking as input a set of pairwise maps, which are then jointly optimized across the whole collection so that certain accuracy and consistency criteria are satisfied. One natural requirement is cycle-consistency, namely the fact that map composition should give the same result regardless of the path taken in the shape collection. In this paper, we introduce a novel approach to obtain consistent matches without requiring initial pairwise solutions to be given as input. We do so by optimizing a joint measure of metric distortion directly over the space of cycle-consistent maps; in order to allow for partially-similar and extra-class shapes, we formulate the problem as a series of quadratic programs with sparsity-inducing constraints, making our technique a natural candidate for analyzing collections with a large presence of outliers. The particular form of the problem allows us to leverage results and tools from the field of evolutionary game theory. This enables a highly efficient optimization procedure which assures accurate and provably consistent solutions in a matter of minutes in collections with hundreds of shapes.
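Cycle-consistency, the requirement highlighted in the abstract, is easy to state concretely when each pairwise map is a permutation: composing the maps around any closed cycle of shapes must return the identity. The sketch below checks this for a toy collection of three "shapes"; all names here are hypothetical, and it illustrates only the consistency criterion, not the paper's optimization over the space of such maps.

```python
import numpy as np

def perm_matrix(perm):
    """Permutation vector -> permutation matrix P with P[i, perm[i]] = 1."""
    n = len(perm)
    P = np.zeros((n, n), dtype=int)
    P[np.arange(n), perm] = 1
    return P

def is_cycle_consistent(maps, cycle):
    """Compose the pairwise maps around a closed cycle of shape indices;
    a consistent collection must give back the identity map."""
    n = maps[(cycle[0], cycle[1])].shape[0]
    total = np.eye(n, dtype=int)
    for i, j in zip(cycle, cycle[1:] + [cycle[0]]):
        total = maps[(i, j)] @ total
    return np.array_equal(total, np.eye(n, dtype=int))

P01 = perm_matrix([1, 2, 0])   # map shape 0 -> shape 1
P12 = perm_matrix([2, 0, 1])   # map shape 1 -> shape 2
maps = {
    (0, 1): P01,
    (1, 2): P12,
    (2, 0): (P12 @ P01).T,     # inverse of a permutation is its transpose
}
assert is_cycle_consistent(maps, [0, 1, 2])
```

Replacing the (2, 0) entry with an arbitrary permutation generally breaks the check, which is exactly the failure mode that methods optimizing pairwise maps independently must correct for.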
Permutation Diffusion Maps with Application to the Image Association Problem in Computer Vision
Abstract
Consistently matching keypoints across images, and the related problem of finding clusters of nearby images, are critical components of various tasks in Computer Vision, including Structure from Motion (SfM). Unfortunately, occlusion and large repetitive structures tend to mislead most currently used matching algorithms, leading to characteristic pathologies in the final output. In this paper we propose a new method, Permutation Diffusion Maps (PDM), and a related new affinity measure, Permutation Diffusion Affinity (PDA), to solve this problem. PDM is inspired by Vector Diffusion Maps, recently introduced by Singer and Wu, and uses ideas from the theory of Fourier analysis on the symmetric group. We show that when dealing with difficult datasets, using PDM as a preprocessing step to existing SfM pipelines can significantly improve results.
DataDriven Shape Analysis and Processing
, 2015
Abstract
Data-driven methods serve an increasingly important role in discovering geometric, structural, and semantic relationships between shapes. In contrast to traditional approaches that process shapes in isolation of each other, data-driven methods aggregate information from 3D model collections to improve the analysis, modeling and editing of shapes. Data-driven methods are also able to learn computational models that reason about properties and relationships of shapes without relying on hard-coded rules or explicitly programmed instructions. Through reviewing the literature, we provide an overview of the main concepts and components of these methods, as well as discuss their application to classification, segmentation, matching, reconstruction, modeling and exploration, as well as scene analysis and synthesis. We conclude our report with ideas that can inspire future research in data-driven shape analysis and processing.