Results 1–10 of 122
The jackknife: a review
 Biometrika
, 1974
Abstract

Cited by 101 (0 self)
Interleukin (IL)-33 is a new member of the IL-1 superfamily of cytokines that is expressed mainly by stromal cells, such as epithelial and endothelial cells, and its expression is upregulated following pro-inflammatory stimulation. IL-33 can function both as a traditional cytokine and as a nuclear factor regulating gene transcription. It is thought to function as an 'alarmin' released following cell necrosis, alerting the immune system to tissue damage or stress. It mediates its biological effects via interaction with the receptors ST2 (IL-1RL1) and IL-1 receptor accessory protein (IL-1RAcP), both of which are widely expressed, particularly by innate immune cells and T helper 2 (Th2) cells. IL-33 strongly induces Th2 cytokine production from these cells and can promote the pathogenesis of Th2-related diseases such as asthma, atopic dermatitis and anaphylaxis. However, IL-33 has shown various protective effects in cardiovascular diseases such as atherosclerosis, obesity, type 2 diabetes and cardiac remodeling. Thus, the effects of IL-33 are either pro- or anti-inflammatory depending on the disease and the model. In this review the role of IL-33 in the inflammation of several disease pathologies will be discussed, with particular emphasis on recent advances.
Multi-criteria Reinforcement Learning
, 1998
Abstract

Cited by 34 (0 self)
We consider multi-criteria sequential decision making problems where the vector-valued evaluations are compared by a given, fixed total ordering. Conditions for the optimality of stationary policies and the Bellman optimality equation are given. The analysis requires special care, as the topology introduced by pointwise convergence and the order topology introduced by the preference order are in general incompatible. Reinforcement learning algorithms are proposed and analyzed. Preliminary computer experiments confirm the validity of the derived algorithms. It is observed that in the medium term, multi-criteria RL often converges to better solutions (measured by the first criterion) than its single-criterion counterparts. These types of multi-criteria problems are most useful when there are several optimal solutions to a problem and one wants to choose the one among these which is optimal according to another fixed criterion. Example applications include alternating games, when in addition...
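The abstract compares vector-valued returns under a fixed total ordering. As a minimal sketch, assuming the common lexicographic choice of that ordering (the paper allows an arbitrary fixed total order):

```python
# Sketch: comparing vector-valued returns under a fixed total ordering.
# The lexicographic order used here is one illustrative choice, not
# necessarily the one used in the paper.

def lex_better(u, v):
    """True if return vector u strictly precedes v lexicographically
    (criterion 0 is most important; ties broken by later criteria)."""
    for a, b in zip(u, v):
        if a != b:
            return a > b
    return False  # equal vectors: u is not strictly better

def greedy_action(q_values):
    """Pick the action whose vector-valued Q estimate is lexicographically
    maximal. q_values maps action -> tuple of per-criterion values."""
    best = None
    for action, q in q_values.items():
        if best is None or lex_better(q, q_values[best]):
            best = action
    return best
```

Among actions tied on the first criterion, the second criterion decides, matching the abstract's remark about selecting, among several first-criterion-optimal solutions, the one optimal by another criterion.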
Importance sampling for reinforcement learning with multiple objectives
, 2001
Pareto-Based Multiobjective Machine Learning: An Overview and Case Studies
, 2008
Abstract

Cited by 34 (1 self)
Machine learning is inherently a multiobjective task. Traditionally, however, either only one of the objectives is adopted as the cost function or multiple objectives are aggregated to a scalar cost function. This can be mainly attributed to the fact that most conventional learning algorithms can only deal with a scalar cost function. Over the last decade, efforts on solving machine learning problems using the Pareto-based multiobjective optimization methodology have gained increasing impetus, particularly due to the great success of multiobjective optimization using evolutionary algorithms and other population-based stochastic search methods. It has been shown that Pareto-based multiobjective learning approaches are more powerful compared to learning algorithms with a scalar cost function in addressing various topics of machine learning, such as clustering, feature selection, improvement of generalization ability, knowledge extraction, and ensemble generation. One common benefit of the different multiobjective learning approaches is that a deeper insight into the learning problem can be gained by analyzing the Pareto front composed of multiple Pareto-optimal solutions. This paper provides an overview of the existing research on multiobjective machine learning, focusing on supervised learning. In addition, a number of case studies are provided to illustrate the major benefits of the Pareto-based approach to machine learning, e.g., how to identify interpretable models and models that can generalize on unseen data from the obtained Pareto-optimal solutions. Three approaches to Pareto-based multiobjective ensemble generation are compared and discussed in detail. Finally, potentially interesting topics in multiobjective machine learning are suggested.
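The Pareto front the abstract refers to is the set of non-dominated trade-offs among objectives. A minimal sketch of extracting it, assuming two objectives to be minimized (e.g. training error and model complexity; the objective names are illustrative, not from the paper):

```python
# Sketch: extracting the Pareto front from candidate models, each scored
# on several objectives to be minimized.

def dominates(a, b):
    """a dominates b if a is no worse on every objective and strictly
    better on at least one (minimization convention)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(points):
    """Return the non-dominated subset of a list of objective vectors."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]
```

Analyzing this subset, rather than a single scalar-optimal model, is the "deeper insight" the abstract describes.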
A Bayesian Approach for Learning and Planning in Partially Observable Markov Decision Processes
Abstract

Cited by 19 (2 self)
Bayesian learning methods have recently been shown to provide an elegant solution to the exploration-exploitation tradeoff in reinforcement learning. However, most investigations of Bayesian reinforcement learning to date focus on standard Markov Decision Processes (MDPs). The primary focus of this paper is to extend these ideas to the case of partially observable domains, by introducing the Bayes-Adaptive Partially Observable Markov Decision Process. This new framework can be used to simultaneously (1) learn a model of the POMDP domain through interaction with the environment, (2) track the state of the system under partial observability, and (3) plan (near-)optimal sequences of actions. An important contribution of this paper is to provide theoretical results showing how the model can be finitely approximated while preserving good learning performance. We present approximate algorithms for belief tracking and planning in this model, as well as empirical results that illustrate how the model estimate and agent's return improve as a function of experience. Keywords: reinforcement learning, Bayesian inference, partially observable Markov decision processes
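The model-learning component (1) rests on maintaining a posterior over transition dynamics. A minimal sketch of that idea in its simplest, fully observable form, using Dirichlet counts (the Bayes-Adaptive POMDP of the paper additionally folds such counts into the belief state; states and actions here are illustrative assumptions):

```python
from collections import defaultdict

# Sketch: Bayesian model learning via Dirichlet transition counts.
# The posterior mean of P(s' | s, a) is the smoothed empirical frequency.

class DirichletModel:
    def __init__(self, prior=1.0):
        self.prior = prior                  # symmetric Dirichlet prior
        self.counts = defaultdict(float)    # (s, a, s') -> observed count

    def update(self, s, a, s_next):
        """Record one observed transition."""
        self.counts[(s, a, s_next)] += 1.0

    def prob(self, s, a, s_next, states):
        """Posterior-mean transition probability under the Dirichlet posterior."""
        num = self.counts[(s, a, s_next)] + self.prior
        den = (sum(self.counts[(s, a, t)] for t in states)
               + self.prior * len(states))
        return num / den
```

With no data the estimate is uniform; it concentrates on the empirical frequencies as experience accumulates, which is how "the model estimate improves as a function of experience".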
Dynamic Preferences in Multi-Criteria Reinforcement Learning
 In Proceedings of ICML-05
, 2005
Abstract

Cited by 19 (3 self)
The current framework of reinforcement learning is based on maximizing the expected return derived from scalar rewards. But in many real-world situations, tradeoffs must be made among multiple objectives. Moreover, the agent's preferences between different objectives may vary with time. In this paper, we consider the problem of learning in the presence of time-varying preferences among multiple objectives, using numeric weights to represent their importance. We propose a method that allows us to store a finite number of policies, choose an appropriate policy for any weight vector and improve upon it. The idea is that although there are infinitely many weight vectors, they may be well-covered by a small number of optimal policies. We show this empirically in two domains: a version of the Buridan's ass problem and network routing.
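A minimal sketch of the policy-reuse idea the abstract describes: scalarize vector rewards with the current weights, and serve a query weight vector with the stored policy whose training weights are nearest. The library entries and the nearest-neighbor rule are illustrative assumptions, not details from the paper:

```python
import math

# Sketch: serving a time-varying preference from a small library of
# stored policies, each tagged with the weight vector it was trained for.

def scalarize(reward_vector, weights):
    """Collapse a vector reward into a scalar via the current weights."""
    return sum(w * r for w, r in zip(weights, reward_vector))

def nearest_policy(library, weights):
    """library: list of (weight_vector, policy) pairs. Return the policy
    trained for the weight vector nearest (Euclidean) to the query."""
    def dist(w):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(w, weights)))
    return min(library, key=lambda entry: dist(entry[0]))[1]
```

The retrieved policy is then a warm start to "improve upon", rather than learning from scratch for every new preference.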
Optimal approximate dynamic programming algorithms for a general class of storage problems
, 2007
doi: 10.1287/moor.1080.0360
Reinforcement Learning with Bounded Risk
 In Proceedings of the Eighteenth International Conference on Machine Learning
, 2001
Abstract

Cited by 18 (2 self)
In this paper, we consider finite MDPs with fatal states. We define the risk under a policy as the probability of entering a fatal state, which is different from the notion of risk normally used in DP and RL (most often regarding the variance of the return). We consider the problem of finding optimal policies with bounded risk, i.e. where the risk is smaller than some user-specified threshold ω, and formalize it as a constrained MDP with two infinite-horizon criteria: a discounted one for the value of a state and an undiscounted criterion for the risk. We define a heuristic, model-free reinforcement learning algorithm that finds good deterministic policies for the constrained problem. The algorithm is based on an abstract ordering of the multidimensional return space. It uses a weighted formulation of the problem. The internal weight parameter is adjusted by a heuristic optimization algorithm.
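A minimal sketch of the weighted formulation the abstract mentions: combine a value estimate with a risk estimate (probability of reaching a fatal state) through an internal weight ξ, and check feasibility against the user-specified threshold ω. The symbols follow the abstract's setup; the candidate policies and concrete numbers are illustrative assumptions:

```python
# Sketch: weighted value/risk trade-off for risk-bounded policy selection.
# xi is the internal weight; omega is the user-specified risk threshold.

def weighted_objective(value, risk, xi):
    """Scalar objective: favor value, penalize risk, trade off via xi >= 0."""
    return value - xi * risk

def best_feasible(candidates, omega, xi):
    """candidates: list of (name, value, risk). Among policies whose risk
    stays below omega, return the name maximizing the weighted objective;
    None if no candidate is feasible."""
    feasible = [c for c in candidates if c[2] <= omega]
    if not feasible:
        return None
    return max(feasible, key=lambda c: weighted_objective(c[1], c[2], xi))[0]
```

In the paper's algorithm ξ itself is adjusted heuristically; here it is simply passed in.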
A Robust Geometric Approach to Multi-Criterion Reinforcement Learning
 Journal of Machine Learning Research
, 2004
Abstract

Cited by 18 (1 self)
We consider the problem of reinforcement learning in a dynamic environment, where the learning objective is defined in terms of multiple reward functions of the average-reward type. The environment is initially unknown, and furthermore may be affected by the actions of other agents, which are observed but cannot be predicted in advance. We model this situation through a stochastic (Markov) game model, between the learning agent and an arbitrary player, with vector-valued rewards. State recurrence conditions are imposed throughout. The objective of the learning agent is to have its long-term average reward vector belong to a desired target set. Starting with a given target set, we devise learning algorithms to achieve this task. These algorithms rely on learning algorithms for appropriately defined scalar rewards, together with the geometric insight of the theory of approachability for stochastic games. We then address the more general problem where the target set itself may depend on the model parameters, and hence is not known in advance to the learning agent. A particular case which falls into this framework is that of stochastic games with average reward constraints. Further specialization provides a reinforcement learning algorithm for constrained Markov decision processes. Some basic examples are provided to illustrate these results.
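The geometric insight behind approachability can be sketched as follows: track the long-term average reward vector and, when it lies outside the target set, steer along the direction from the current average toward the closest point of the target. The axis-aligned box target used here is an illustrative assumption; the theory allows general convex target sets:

```python
# Sketch: steering the average reward vector toward a box-shaped target set.

def project_to_box(point, lows, highs):
    """Closest point of the axis-aligned box [lows, highs] to `point`."""
    return tuple(min(max(x, lo), hi) for x, lo, hi in zip(point, lows, highs))

def steering_direction(avg_reward, lows, highs):
    """Direction from the current average reward toward the target box;
    the zero vector means the average is already inside the target."""
    proj = project_to_box(avg_reward, lows, highs)
    return tuple(p - x for p, x in zip(proj, avg_reward))
```

In the paper's algorithms, such a direction defines the scalarized reward whose learned policy pushes the running average back toward the target.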