Results 1 – 6 of 6
Incoherence-optimal matrix completion, 2013
Abstract

Cited by 16 (3 self)
This paper considers the matrix completion problem. We show that it is not necessary to assume joint incoherence, which is a standard but unintuitive and restrictive condition imposed by previous studies. This leads to a sample complexity bound that is order-wise optimal with respect to the incoherence parameter (as well as to the rank r and the matrix dimension n, up to a log n factor). As a consequence, we improve the sample complexity of recovering a semidefinite matrix from O(nr² log² n) to O(nr log² n), and the highest allowable rank from Θ(√n / log n) to Θ(n / log² n). The key step in the proof is to obtain new bounds on the ℓ∞,2 norm, defined as the maximum of the row and column norms of a matrix. To demonstrate the applicability of our techniques, we discuss extensions to SVD projection, semi-supervised clustering, and structured matrix completion. Finally, we turn to the low-rank-plus-sparse matrix decomposition problem, and show that the joint incoherence condition is unavoidable here under computational complexity assumptions on the classical planted clique problem. This means that it is intractable in general to separate a rank-ω(√n) positive semidefinite matrix and a sparse matrix.
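As a concrete illustration of the kind of recovery this abstract refers to, here is a minimal alternating-projection sketch (a standard singular-value-projection-style iteration, not the paper's exact estimator; the function name and parameters are illustrative): it repeatedly projects onto rank-r matrices and re-imposes the observed entries.

```python
import numpy as np

def complete_matrix(M_obs, mask, rank, iters=300):
    """Alternate between a rank-`rank` projection (truncated SVD)
    and re-imposing the observed entries indicated by `mask`."""
    X = np.where(mask, M_obs, 0.0)
    Z = X
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        Z = (U[:, :rank] * s[:rank]) @ Vt[:rank]   # best rank-r approximation
        X = np.where(mask, M_obs, Z)               # keep observed entries fixed
    return Z
```

With enough uniformly sampled entries of an incoherent rank-r matrix, iterations of this kind typically recover the matrix to high accuracy.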
PU Learning for Matrix Completion, 2015
Inderjit S. Dhillon, Dept. of Computer Science, UT Austin
Abstract

Cited by 1 (1 self)
In this paper, we consider the matrix completion problem when the observations are one-bit measurements of some underlying matrix M, and in particular the observed samples consist only of ones and no zeros. This problem is motivated by modern applications such as recommender systems and social networks where only “likes” or “friendships” are observed. The problem is an instance of PU (positive-unlabeled) learning, i.e. learning from only positive and unlabeled examples, which has been studied in the context of binary classification. Under the assumption that M has bounded nuclear norm, we provide recovery guarantees for two different observation models: 1) M parameterizes a distribution that generates a binary matrix, 2) M is thresholded to obtain a binary matrix. For the first case, we propose a “shifted matrix completion” method that recovers M using only a subset of indices corresponding to ones; for the second case, we propose a “biased matrix completion” method that recovers the (thresholded) binary matrix. Both methods yield strong error bounds: if M ∈ R^{n×n}, the error is bounded as O
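The “biased matrix completion” idea can be sketched roughly as follows (an illustrative reweighting, not the authors' exact formulation; the function and its parameters are hypothetical): observed ones get a high weight, every other entry is treated as a weakly weighted zero, and low-rank factors are fit to the weighted targets by gradient descent.

```python
import numpy as np

def biased_pu_completion(pos_mask, rank=5, alpha=0.9, lr=0.05, iters=500, seed=0):
    """PU-style biased completion sketch: observed ones carry weight
    alpha; unlabeled entries are soft zeros with weight 1 - alpha."""
    rng = np.random.default_rng(seed)
    n, m = pos_mask.shape
    U = 0.1 * rng.standard_normal((n, rank))
    V = 0.1 * rng.standard_normal((m, rank))
    W = np.where(pos_mask, alpha, 1.0 - alpha)    # per-entry confidence weights
    Y = pos_mask.astype(float)                    # targets: 1 observed, 0 otherwise
    for _ in range(iters):
        R = W * (U @ V.T - Y)                     # weighted residual
        U, V = U - lr * (R @ V), V - lr * (R.T @ U)
    return U @ V.T
```

Entries that are truly ones but unobserved receive only the small weight, so the low-rank structure can pull their predicted scores up toward the observed ones.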
Interactively Guiding Semi-Supervised Clustering via Attribute-based Explanations, 2015
Shrenik Lad
Abstract
Unsupervised image clustering is a challenging and often ill-posed problem. Existing image descriptors fail to capture the clustering criterion well, and more importantly, the criterion itself may depend on (unknown) user preferences. Semi-supervised approaches such as distance metric learning and constrained clustering thus leverage user-provided annotations indicating which pairs of images belong to the same cluster (must-link) and which ones do not (cannot-link). These approaches require many such constraints before achieving good clustering performance because each constraint only provides weak cues about the desired clustering. In this work, we propose to use image attributes as a modality for the user to provide more informative cues. In particular, the clustering algorithm iteratively and actively queries a user with an image pair. Instead of the user simply providing a must-link/cannot-link constraint for the pair, the user also provides an attribute-based reasoning, e.g. “these two images are similar because both are natural and have still water” or “these two people are dissimilar because one is way older than the other”. Under the guidance of this explanation, and equipped with attribute predictors, many additional constraints are automatically generated. We demonstrate the effectiveness of our approach by incorporating the proposed attribute-based explanations in three standard semi-supervised clustering algorithms: Constrained K-Means, MPCK-Means, and Spectral Clustering, on three domains: scenes, shoes, and faces, using both binary and relative attributes.
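The Constrained K-Means variant mentioned above can be sketched in the COP-KMeans style (a simplified illustration, not this paper's exact pipeline): each point is assigned to the nearest centroid that does not violate a must-link or cannot-link constraint, given the assignments already made in the current pass.

```python
import numpy as np

def _violates(i, c, labels, must_link, cannot_link):
    """Check whether assigning point i to cluster c breaks a constraint."""
    for a, b in must_link:
        j = b if a == i else a if b == i else None
        if j is not None and labels[j] not in (-1, c):
            return True
    for a, b in cannot_link:
        j = b if a == i else a if b == i else None
        if j is not None and labels[j] == c:
            return True
    return False

def cop_kmeans(X, k, must_link=(), cannot_link=(), iters=20, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    labels = np.full(len(X), -1)
    for _ in range(iters):
        labels[:] = -1
        for i in range(len(X)):
            order = np.argsort(np.linalg.norm(X[i] - centers, axis=1))
            # nearest centroid that respects the constraints; fall back to
            # the plain nearest one if every choice is blocked
            c = next((c for c in order
                      if not _violates(i, c, labels, must_link, cannot_link)),
                     order[0])
            labels[i] = c
        for c in range(k):                      # recompute centroids
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels
```

Each pairwise annotation only constrains one assignment decision, which is exactly why the paper argues that many such constraints are needed and attribute-based explanations help generate more of them automatically.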
An Efficient Semi-Supervised Clustering Algorithm with Sequential Constraints
Abstract
Semi-supervised clustering leverages side information such as pairwise constraints to guide clustering procedures. Despite promising progress, existing semi-supervised clustering approaches overlook the setting in which side information is generated sequentially, which arises naturally in numerous real-world applications such as social network and e-commerce system analysis. Given newly emerged constraints, classical semi-supervised clustering algorithms need to re-optimize their objectives over all available data samples and constraints, which prevents them from efficiently updating the obtained data partitions. To address this challenge, we propose an efficient dynamic semi-supervised clustering framework that casts the clustering problem into a search problem over a feasible convex set, i.e., a convex
Extracting Certainty from Uncertainty: Transductive Pairwise Classification from Pairwise Similarities
Abstract
In this work, we study the problem of transductive pairwise classification from pairwise similarities. The goal is to infer the pairwise class relationships, to which we refer as pairwise labels, between all examples, given a subset of class relationships for a small set of examples, to which we refer as labeled examples. We propose a very simple yet effective algorithm that consists of two steps: the first is to complete the sub-matrix corresponding to the labeled examples, and the second is to reconstruct the label matrix from the completed sub-matrix and the provided similarity matrix. Our analysis shows that under several mild preconditions we can recover the label matrix with a small error, provided that the top eigenspace corresponding to the largest eigenvalues of the similarity matrix covers the column space of the label matrix well and has low coherence, and the number of observed pairwise labels is sufficiently large. We demonstrate the effectiveness of the proposed algorithm in several experiments.
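The second (reconstruction) step can be sketched as follows, under the assumption that the labeled sub-block of the pairwise-label matrix has already been completed; `transductive_pairwise` and its parameters are hypothetical names for illustration. A small core matrix is fit on the labeled block within the top eigenspace of the similarity matrix, then extended to all pairs.

```python
import numpy as np

def transductive_pairwise(S, F_sub, labeled, r):
    """Fit F ≈ U C U^T, where U spans the top-r eigenspace of the
    similarity matrix S and C is learned from the labeled block."""
    w, V = np.linalg.eigh(S)
    U = V[:, np.argsort(w)[::-1][:r]]          # top-r eigenvectors of S
    U_l = U[labeled]                           # rows for labeled examples
    # vec(U_l C U_l^T) = (U_l ⊗ U_l) vec(C) in row-major convention
    C, *_ = np.linalg.lstsq(np.kron(U_l, U_l), F_sub.ravel(), rcond=None)
    return U @ C.reshape(r, r) @ U.T
```

If the top eigenspace of S covers the column space of the label matrix, the labeled block pins down the core matrix and the product extends the labels to all unobserved pairs.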
Learning Concept Embeddings with Combined Human-Machine Expertise
Abstract
This paper presents our work on “SNaCK,” a low-dimensional concept embedding algorithm that combines human expertise with automatic machine similarity kernels. Both parts are complementary: human insight can capture relationships that are not apparent from the object’s visual similarity, and the machine can relieve the human from having to exhaustively specify many constraints. We show that our SNaCK embeddings are useful in several tasks: distinguishing prime and non-prime numbers on MNIST, discovering labeling mistakes in the Caltech-UCSD Birds (CUB) dataset with the help of deep-learned features, creating training datasets for bird classifiers, capturing subjective human taste on a new dataset of 10,000 foods, and qualitatively exploring an unstructured set of pictographic characters. Comparisons with the state of the art in these tasks show that SNaCK produces better concept embeddings that require less human supervision than the leading methods.
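A much-simplified illustration of the kernel-combination idea (this is kernel PCA on a blended kernel, not the actual SNaCK objective, which combines a t-SNE-style loss with human triplet constraints; all names here are illustrative):

```python
import numpy as np

def combined_embedding(K_machine, K_human, w=0.5, dim=2):
    """Blend a machine similarity kernel with a human-derived one,
    then embed the blended kernel via kernel PCA."""
    K = (1 - w) * K_machine + w * K_human
    n = len(K)
    H = np.eye(n) - np.ones((n, n)) / n        # double-centering matrix
    vals, vecs = np.linalg.eigh(H @ K @ H)
    top = np.argsort(vals)[::-1][:dim]
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))
```

The weight w plays the role of trading off machine similarity against human judgments: relationships visible only to humans enter through K_human without the human having to specify every pairwise constraint.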