Results 1 - 10 of 79
Bayesian inference and optimal design in the sparse linear model
 Workshop on Artificial Intelligence and Statistics
Cited by 110 (12 self)
The linear model with sparsity-favouring prior on the coefficients has important applications in many different domains. In machine learning, most methods to date search for maximum a posteriori sparse solutions and neglect to represent posterior uncertainties. In this paper, we address problems of Bayesian optimal design (or experiment planning), for which accurate estimates of uncertainty are essential. To this end, we employ expectation propagation approximate inference for the linear model with Laplace prior, giving new insight into numerical stability properties and proposing a robust algorithm. We also show how to estimate model hyperparameters by empirical Bayesian maximisation of the marginal likelihood, and propose ideas in order to scale up the method to very large underdetermined problems. We demonstrate the versatility of our framework on the application of gene regulatory network identification from microarray expression data, where both the Laplace prior and the active experimental design approach are shown to result in significant improvements. We also address the problem of sparse coding of natural images, and show how our framework can be used for compressive sensing tasks. Part of this work appeared in Seeger et al. (2007b). The gene network identification application appears in Steinke et al. (2007).
Composite Objective Mirror Descent
Cited by 66 (9 self)
We present a new method for regularized convex optimization and analyze it under both online and stochastic optimization settings. In addition to unifying previously known first-order algorithms, such as the projected gradient method, mirror descent, and forward-backward splitting, our method yields new analysis and algorithms. We also derive specific instantiations of our method for commonly used regularization functions, such as ℓ1, mixed norm, and trace-norm.
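As a rough illustration of what an ℓ1 instantiation can look like, the sketch below shows the Euclidean special case only: a gradient step on the loss followed by the proximal operator of the ℓ1 term (soft-thresholding), which is the familiar forward-backward splitting update. The function names `soft_threshold` and `composite_step`, the step size `eta`, and the regularization weight `lam` are assumptions made for this example and are not taken from the paper; the paper's general method also covers non-Euclidean geometries, which this sketch does not.

```python
def soft_threshold(v, tau):
    """Elementwise proximal operator of tau * ||.||_1 (soft-thresholding)."""
    return [max(abs(x) - tau, 0.0) * (1.0 if x > 0 else -1.0) for x in v]


def composite_step(w, grad, eta, lam):
    """One Euclidean composite update (illustrative, not the paper's code).

    First take a plain gradient step on the loss, then apply the prox of
    the scaled l1 regularizer eta * lam * ||w||_1.
    """
    moved = [wi - eta * gi for wi, gi in zip(w, grad)]
    return soft_threshold(moved, eta * lam)
```

Coordinates whose magnitude after the gradient step falls below `eta * lam` are set exactly to zero, which is why this kind of update tends to produce sparse iterates.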
On the Capacity of Multiple Input Multiple Output Broadcast Channels
 In Proceedings of Int. Conf. Commun
, 2002
Cited by 53 (12 self)
We consider a two-user multiple-input multiple-output (MIMO) Gaussian broadcast channel (BC), where the transmitter has t transmit antennas and the receivers have r1 and r2 antennas, respectively. Since the MIMO broadcast channel is in general a non-degraded broadcast channel, its capacity region remains an unsolved problem. In this paper, we establish a duality between what is termed the "dirty paper" region (or the Costa-Caire-Shamai-Yu achievable region) [5, 7] for the MIMO broadcast channel and the capacity region of the MIMO multiple-access channel (MAC), which is easy to compute. Using this duality, we greatly reduce the computational complexity required for obtaining the dirty paper achievable region for the MIMO BC. The duality also enables us to translate previously known results for the MIMO MAC (like iterative water-filling [7]) to the MIMO BC. We show that the dirty paper achievable region achieves the sum-rate capacity of the MIMO BC by establishing that the sum-rate point in this region equals an upper bound on the sum rate of the MIMO BC.
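The abstract refers to iterative water-filling for the MIMO MAC. The sketch below shows only the scalar building block that name alludes to, namely classical water-filling of a power budget over parallel channels, and is not the paper's algorithm. The function name `water_fill` and its arguments are assumptions for this example: `noise_levels` holds each channel's noise-to-gain ratio and `total_power` is the budget to distribute.

```python
def water_fill(noise_levels, total_power):
    """Classical water-filling over parallel channels (illustrative sketch).

    Returns powers p_i = max(mu - n_i, 0), where the water level mu is
    chosen so that the powers sum to total_power.
    """
    if not noise_levels:
        return []
    levels = sorted(noise_levels)
    # Try the k best channels, shrinking k until the water level sits
    # above the worst channel that is still in use.
    for k in range(len(levels), 0, -1):
        mu = (total_power + sum(levels[:k])) / k
        if mu > levels[k - 1]:
            break
    return [max(mu - n, 0.0) for n in noise_levels]
```

Channels whose noise level lies above the water level receive no power, so the budget concentrates on the better channels.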
RaWMS - Random Walk based Lightweight Membership Service for Wireless Ad Hoc Networks
, 2008
Cited by 37 (8 self)
This paper presents RaWMS, a novel lightweight random membership service for ad hoc networks. The service provides each node with a partial uniformly chosen view of network nodes. Such a membership service is useful, e.g., in data dissemination algorithms, lookup and discovery services, peer sampling services, and complete membership construction. The design of RaWMS is based on a novel reverse random walk (RW) sampling technique. The paper includes a formal analysis of both the reverse RW sampling technique and RaWMS and verifies it through a detailed simulation study. In addition, RaWMS is compared both analytically and by simulations with a number of other known methods such as flooding and gossip-based techniques.
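To give a feel for random-walk sampling in general, here is a minimal sketch of a maximum-degree random walk, a textbook technique whose stationary distribution is uniform on a connected undirected graph (convergence additionally requires the walk to be aperiodic). It is not the reverse RW technique of RaWMS, which the abstract does not describe in detail. The function name `max_degree_walk`, the adjacency-dictionary representation `adj`, and the parameters `start`, `steps`, and `rng` are all assumptions made for this example.

```python
import random


def max_degree_walk(adj, start, steps, rng):
    """Run a maximum-degree random walk and return the node it ends on.

    At each hop the walker moves to each neighbour with probability
    1 / d_max and otherwise stays put, which makes the transition matrix
    symmetric and the stationary distribution uniform.
    """
    d_max = max(len(nbrs) for nbrs in adj.values())
    node = start
    for _ in range(steps):
        nbrs = adj[node]
        idx = rng.randrange(d_max)
        if idx < len(nbrs):
            node = nbrs[idx]
    return node


# Hypothetical 4-node star topology used purely for illustration.
example_adj = {"a": ["b", "c", "d"], "b": ["a"], "c": ["a"], "d": ["a"]}
sample = max_degree_walk(example_adj, "a", 50, random.Random(1))
```

A plain random walk would instead favour high-degree nodes, so the self-loops are what remove the degree bias here.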
Moment Explosions and Stationary Distributions in Affine Diffusion Models
, 2007
Cited by 14 (1 self)
Many of the most widely used models in finance fall within the affine family of diffusion processes. The affine family combines modeling flexibility with substantial tractability, particularly through transform analysis; these models are used both for econometric modeling and for pricing and hedging of derivative securities. We analyze the tail behavior, the range of finite exponential moments, and the convergence to stationarity in affine models, focusing on the class of canonical models defined by Dai and Singleton [9]. We show that these models have limiting stationary distributions and characterize these limits. We show that the tails of both the transient and stationary distributions of these models are necessarily exponential or Gaussian; in the non-Gaussian case, we characterize the tail decay rate for any linear combination of factors. We also give necessary and sufficient conditions for a linear combination of factors to be Gaussian. Our results follow from an investigation into the stability properties of the systems of ordinary differential equations associated with affine diffusions.
Co-clustering for directed graphs; the stochastic co-blockmodel and a spectral algorithm
, 2012
Cited by 12 (1 self)
Communities of highly connected actors form an essential feature in the structure of several empirical directed and undirected networks. However, compared to the amount of research on clustering for undirected graphs, there is relatively little understanding of clustering in directed networks. This paper extends the spectral clustering algorithm to directed networks in a way that co-clusters or bi-clusters the rows and columns of a graph Laplacian. Co-clustering leverages the increased complexity of asymmetric relationships to gain new insight into the structure of the directed network. To understand this algorithm and to study its asymptotic properties in a canonical setting, we propose the Stochastic Co-Blockmodel to encode co-clustering structure. This is the first statistical model of co-clustering and it is derived using the concept of stochastic equivalence that motivated the original Stochastic Blockmodel. Although directed spectral clustering is not derived from the Stochastic Co-Blockmodel, we show that, asymptotically, the algorithm can estimate the blocks in a high-dimensional asymptotic setting in which the number of blocks grows with the number of nodes. The algorithm, model, and asymptotic results can all be extended to bipartite graphs.
On Fast Dropout and its Applicability to Recurrent Networks
Cited by 6 (4 self)
Recurrent Neural Networks (RNNs) are rich models for the processing of sequential data. Recent work on advancing the state of the art has been focused on the optimization or modelling of RNNs, mostly motivated by addressing the problems of the vanishing and exploding gradients. The control of overfitting has seen considerably less attention. This paper contributes to that by analyzing fast dropout, a recent regularization method for generalized linear models and neural networks, from a back-propagation-inspired perspective. We show that fast dropout implements a quadratic form of an adaptive, per-parameter regularizer, which rewards large weights in the light of underfitting, penalizes them for overconfident predictions and vanishes at minima of an unregularized training loss. The derivatives of that regularizer are exclusively based on the training error signal. One consequence of this is the absence of a global weight attractor, which is particularly appealing for RNNs, since the dynamics are not biased towards a certain regime. We positively test the hypothesis that this improves the performance of RNNs on four musical data sets.
CORPORATE NETWORKS AND PEER EFFECTS IN FIRM POLICIES
, 2011
Cited by 5 (0 self)
Job market paper. This paper identifies the effect of corporate networks on firms' financial investment and executive pay decisions. Corporate networks arise through board interlocks, which provide a frequent and important channel for non-market interactions amongst firms. Using panel data for all publicly traded companies in India, I estimate peer effects in firm policies, defining each firm's reference group as the set of all other firms with whom it shares one or more directors. Identification of dynamic network peer effects, which derive from endogenous associations, is achieved by exploiting natural breaks in network evolution that exogenously change the composition of peers. These breaks occur as a result of local network shocks (death or retirement of shared directors) that are stochastic and external to the network formation process. I find significant network peer effects that are positively associated with firms' investment strategy and executive compensation. I also explore heterogeneity in peer effects by distinguishing network peers who belong to the same industry from those who do not, and find a greater effect of across-industry network peers.
Sparse PCA through Low-rank Approximations
Cited by 4 (1 self)
We introduce a novel algorithm that computes the k-sparse principal component of a positive semidefinite matrix A. Our algorithm is combinatorial and operates by examining a discrete set of special vectors lying in a low-dimensional eigen-subspace of A. We obtain provable approximation guarantees that depend on the spectral profile of the matrix: the faster the eigenvalue decay, the better the quality of our approximation. For example, if the eigenvalues of A follow a power-law decay, we obtain a polynomial-time approximation algorithm for any desired accuracy. We implement our algorithm and test it on multiple artificial and real data sets. Due to a feature elimination step, it is possible to perform sparse PCA on data sets consisting of millions of entries in a few minutes. Our experimental evaluation shows that our scheme is nearly optimal while finding very sparse vectors. We compare to the prior state of the art and show that our scheme matches or outperforms previous algorithms in all tested data sets.
Boosting Video Popularity through Recommendation Systems
 In DBSocial
, 2011
Cited by 3 (0 self)
While search engines are the major sources of content discovery on online content providers and e-commerce sites, their capability is limited since textual descriptions cannot fully describe the semantics of content such as videos. Recommendation systems are now widely used in online content providers and e-commerce sites and play an important role in discovering content. In this paper, we describe how one can boost the popularity of a video through the recommendation system in YouTube. We present a model that captures the view propagation between videos through the recommendation linkage and quantifies the influence that a video has on the popularity of another video. Furthermore, we identify that the similarity in titles and tags is an important factor in forming the recommendation linkage between videos. This suggests that one can manipulate the metadata of a video to boost its popularity.