Results 1 - 10 of 21
Near-optimal sensor placements in gaussian processes
- In ICML, 2005
Cited by 333 (34 self)
When monitoring spatial phenomena, which can often be modeled as Gaussian processes (GPs), choosing sensor locations is a fundamental task. There are several common strategies to address this task, for example, geometry or disk models, placing sensors at the points of highest entropy (variance) in the GP model, and A-, D-, or E-optimal design. In this paper, we tackle the combinatorial optimization problem of maximizing the mutual information between the chosen locations and the locations which are not selected. We prove that the problem of finding the configuration that maximizes mutual information is NP-complete. To address this issue, we describe a polynomial-time approximation that is within (1 − 1/e) of the optimum by exploiting the submodularity of mutual information. We also show how submodularity can be used to obtain online bounds, and design branch and bound search procedures. We then extend our algorithm to exploit lazy evaluations and local structure in the GP, yielding significant speedups. We also extend our approach to find placements which are robust against node failures and uncertainties in the model. These extensions are again associated with rigorous theoretical approximation guarantees, exploiting the submodularity of the objective function. We demonstrate the advantages of our approach towards optimizing mutual information in a very extensive empirical study on two real-world data sets.
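The greedy mutual-information rule described in this abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the squared-exponential kernel, the noise level, and the 1-D grid are all assumptions made for the example. At each step it adds the location y that maximizes the ratio of its variance given the already selected set to its variance given all remaining unselected locations.

```python
import numpy as np


def conditional_variance(K, y, S):
    """Variance of location y given observations at the index list S,
    for a zero-mean Gaussian with covariance matrix K."""
    if len(S) == 0:
        return float(K[y, y])
    K_SS = K[np.ix_(S, S)]
    K_yS = K[y, S]
    return float(K[y, y] - K_yS @ np.linalg.solve(K_SS, K_yS))


def greedy_mi_placement(K, k):
    """Greedily pick k indices, each time maximizing
    var(y | selected) / var(y | all other unselected locations)."""
    n = K.shape[0]
    selected = []
    for _ in range(k):
        best, best_score = None, -np.inf
        for y in range(n):
            if y in selected:
                continue
            rest = [i for i in range(n) if i != y and i not in selected]
            score = (conditional_variance(K, y, selected)
                     / conditional_variance(K, y, rest))
            if score > best_score:
                best, best_score = y, score
        selected.append(best)
    return selected


# Hypothetical setup: 20 candidate locations on a line, squared-exponential
# kernel with length scale 0.2, plus a small noise term on the diagonal so
# that every conditional variance stays strictly positive.
x = np.linspace(0.0, 1.0, 20)
K = np.exp(-0.5 * ((x[:, None] - x[None, :]) / 0.2) ** 2) + 1e-2 * np.eye(20)
placement = greedy_mi_placement(K, 4)
print(placement)
```

This naive version re-solves a linear system for every candidate at every step; the lazy evaluations and local-structure tricks mentioned in the abstract address exactly that cost and are not attempted here.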
Near-optimal nonmyopic value of information in graphical models
- In Annual Conference on Uncertainty in Artificial Intelligence
Cited by 142 (25 self)
A fundamental issue in real-world systems, such as sensor networks, is the selection of observations which most effectively reduce uncertainty. More specifically, we address the long-standing problem of nonmyopically selecting the most informative subset of variables in a graphical model. We present the first efficient randomized algorithm providing a constant factor (1 − 1/e − ε) approximation guarantee for any ε > 0 with high confidence. The algorithm leverages the theory of submodular functions, in combination with a polynomial bound on sample complexity. We furthermore prove that no polynomial time algorithm can provide a constant factor approximation better than (1 − 1/e) unless P = NP. Finally, we provide extensive evidence of the effectiveness of our method on two complex real-world datasets.
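As background for the (1 − 1/e) factor quoted in this abstract, the standard definition of submodularity and the classical greedy guarantee can be written as below. The symbols F, V, A, B, x, and k are notation chosen here for illustration, not taken from the abstract, and the bound is stated for the textbook case of a monotone submodular F with F(∅) = 0 under a cardinality constraint.

```latex
% Diminishing returns: adding x helps a smaller set at least as much as a larger one.
F(A \cup \{x\}) - F(A) \;\ge\; F(B \cup \{x\}) - F(B)
\qquad \text{for all } A \subseteq B \subseteq V,\; x \in V \setminus B

% Greedy selection of k elements is within a constant factor of the best size-k set.
F(A_{\mathrm{greedy}}) \;\ge\; \left(1 - \tfrac{1}{e}\right) \max_{|A| \le k} F(A)
```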
Bayesian Treed Gaussian Process Models with an Application to Computer Modeling
- Journal of the American Statistical Association, 2007
Cited by 87 (19 self)
This paper explores nonparametric and semiparametric nonstationary modeling methodologies that couple stationary Gaussian processes and (limiting) linear models with treed partitioning. Partitioning is a simple but effective method for dealing with nonstationarity. Mixing between full Gaussian processes and simple linear models can yield a more parsimonious spatial model while significantly reducing computational effort. The methodological developments and statistical computing details which make this approach efficient are described in detail. Illustrations of our model are given for both synthetic and real datasets. Key words: recursive partitioning, nonstationary spatial model, nonparametric regression, Bayesian model averaging
Active Learning For Identifying Function Threshold Boundaries
Cited by 13 (5 self)
We present an efficient algorithm to actively select queries for learning the boundaries separating a function domain into regions where the function is above and below a given threshold. We develop experiment selection methods based on entropy, misclassification rates, variance, and their combinations, and show how they perform on a number of data sets. We then show how these algorithms are used to determine simultaneously valid 1 − α confidence intervals for seven cosmological parameters. Experimentation shows that the algorithm reduces the computation necessary for the parameter estimation problem by an order of magnitude.
Near-Optimal Sensor Placement for Linear Inverse Problems
Cited by 8 (2 self)
A classic problem is the estimation of a set of parameters from measurements collected by few sensors. The number of sensors is often limited by physical or economic constraints, and their placement is of fundamental importance to obtain accurate estimates. Unfortunately, the selection of the optimal sensor locations is intrinsically combinatorial and the available approximation algorithms are not guaranteed to generate good solutions in all cases of interest. We propose FrameSense, a greedy algorithm for the selection of optimal sensor locations. The core cost function of the algorithm is the frame potential, a scalar property of a matrix that measures the orthogonality of its rows. Notably, FrameSense is the first algorithm that is near-optimal in terms of mean square error, meaning that its solution is always guaranteed to be close to the optimal one. Moreover, we show with an extensive set of numerical experiments that FrameSense achieves state-of-the-art performance while having the lowest computational cost, when compared to other greedy methods.
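To make the frame potential concrete, the sketch below computes it as the sum of squared inner products between rows and uses it in a simple "worst-out" greedy that repeatedly discards the row whose removal lowers the frame potential the most. This is a simplified illustration under assumptions of my own (real-valued matrices, the function names, the removal rule as written); it is not claimed to reproduce the exact FrameSense procedure.

```python
import numpy as np


def frame_potential(Psi):
    """Sum of squared inner products between all pairs of rows of Psi
    (including each row with itself)."""
    G = Psi @ Psi.T
    return float(np.sum(G ** 2))


def worst_out_selection(Psi, k):
    """Keep k rows of Psi by greedily removing, one at a time, the row
    whose removal reduces the frame potential of the kept rows the most."""
    keep = list(range(Psi.shape[0]))
    G2 = (Psi @ Psi.T) ** 2
    while len(keep) > k:
        sub = G2[np.ix_(keep, keep)]
        # Removing row i deletes row i and column i of the squared Gram
        # matrix; the diagonal entry would otherwise be counted twice.
        reduction = 2.0 * sub.sum(axis=1) - np.diag(sub)
        keep.pop(int(np.argmax(reduction)))
    return keep


# Toy example: two identical rows and one row orthogonal to both.
Psi = np.array([[1.0, 0.0],
                [1.0, 0.0],
                [0.0, 1.0]])
kept = worst_out_selection(Psi, 2)
print(kept, frame_potential(Psi[kept]))
```

In the toy example one of the duplicated rows is dropped, leaving two orthogonal rows, which is the configuration with the smallest frame potential among the candidate pairs.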
Adaptive design of supercomputer experiments
- 2006
Cited by 4 (1 self)
Computer experiments are often performed to allow modeling of a response surface of a physical experiment that can be too costly or difficult to run except using a simulator. Running the experiment over a dense grid can be prohibitively expensive, yet running over a sparse design chosen in advance can result in obtaining insufficient information in parts of the space, particularly when the surface is nonstationary. We propose an approach which automatically explores the space while simultaneously fitting the response surface, using predictive uncertainty to guide subsequent experimental runs. The newly developed Bayesian treed Gaussian process is used as the surrogate model, and a fully Bayesian approach allows explicit nonstationary measures of uncertainty. Our adaptive sequential design framework has been developed to cope with an asynchronous, random, agent-based supercomputing environment. We take a hybrid approach which melds optimal strategies from the statistics literature with flexible strategies from the active learning literature. The merits of this approach are borne out in several examples, including the motivating example of a computational fluid dynamics simulation of a rocket booster. Key words: nonstationary spatial model, treed partitioning, sequential design, active learning
Process Driven Spatial and Network Aggregation for Pandemic Response
- In Proc. SIAM DM 2006 Workshop on Spatial Data Mining, 2006
Cited by 3 (1 self)
Phase transitions in measures of cluster connectedness may be used to identify critical points in the propagation of an epidemic. These critical points reflect order-of-magnitude shifts in network properties and thus define appropriate regions for aggregation in the evolving socio-temporal portrait. Analysis of pre- and post-transitional images at these critical points can define principal corridors of propagation and establish the appropriate local scale (aggregate level) for resource allocation strategies. Semi-supervised learning techniques based on Gaussian random fields enable prediction of infectious spread to unlabeled entities, and projections of disease propagation can inform allocation strategies for intelligent targeting of response resources to the most vulnerable locations in the unlabeled network.
Active Learning for Interactive Visualization
Cited by 3 (0 self)
Many automatic visualization methods have been proposed. However, a visualization that is automatically generated might differ from how a user wants to arrange the objects in visualization space. By allowing users to relocate objects in the embedding space of the visualization, they can adjust the visualization to their preference. We propose an active learning framework for interactive visualization which selects objects for the user to relocate so that they can obtain their desired visualization by relocating as few objects as possible. The framework is based on an information theoretic criterion, which favors objects that reduce the uncertainty of the visualization. We present a concrete application of the proposed framework to the Laplacian eigenmap visualization method. We demonstrate experimentally that the proposed framework yields the desired visualization with fewer user interactions than existing methods.
Actively Learning Level-Sets of Composite Functions
Cited by 3 (0 self)
Scientists frequently have multiple types of experiments and data sets on which they can test the validity of their parameterized models and locate plausible regions for the model parameters. By examining multiple data sets, scientists can obtain inferences which typically are much more informative than the deductions derived from each of the data sources independently. Several standard data combination techniques result in target functions which are a weighted sum of the observed data sources. Thus, computing constraints on the plausible regions of the model parameter space can be formulated as finding a level set of a target function which is the sum of observable functions. We propose an active learning algorithm for this problem which selects both a sample (from the parameter space) and an observable function upon which to compute the next sample. Empirical tests on synthetic functions and on real data for an eight-parameter cosmological model show that our algorithm significantly reduces the number of samples required to identify the desired level-set.
A Charging and Storage Infrastructure Design for Electric Vehicles
Cited by 1 (1 self)
Ushered in by recent developments in various areas of science and technology, modern energy systems are going to be an inevitable part of our societies. Smart grids are one of these modern systems that have attracted many research activities in recent years. Before utilizing the next generation of smart grids, we should have a comprehensive understanding of the interdependent energy networks and processes. Next-generation energy system networks cannot be effectively designed, analyzed, and controlled in isolation from the social, economic, sensing, and control contexts in which they operate. In this paper we present a novel framework to support charging and storage infrastructure design for electric vehicles. We develop coordinated clustering techniques to work with network models of urban environments to aid in placement of charging stations for an electric vehicle deployment scenario. Furthermore, we evaluate the network before and after the deployment of charging stations, to recommend the installation of appropriate storage units to overcome the extra load imposed on the network by the charging stations. We demonstrate the multiple factors that can be simultaneously leveraged in our framework in order to achieve practical urban deployment. Our ultimate goal is to help realize sustainable energy system management in urban electrical infrastructure by modeling and analyzing networks of interactions between electric systems and urban