Results

1 - 2 of 2

### Local regularity results for value functions of tug-of-war with noise and running payoff

### unknown title

Abstract. Recent works by Gómez, Pezzo and Rossi have generalized the class of games known as tug-of-war games, which were investigated by Peres, Schramm, Sheffield and Wilson, by analyzing games where the sets of possible movements for the players are either spatially dependent or spatially and temporally dependent. We further generalize this class of games by considering games with surrounding noise which is spatially and temporally dependent and where the sets of possible movements are also spatially and temporally dependent. Our generalization includes simple and natural real-world examples such as competing parties fighting on multiple fronts and general resource-allocation problems. We derive the PDE for which the continuum value function of the game is a solution, and we show that the PDE is an interpolation between the 'infinity Laplacian with spatial and temporal dependence', which corresponds to the movements of the players (α coefficient), and a 'weighted Laplacian', with weights arising from the variances of the noise, which corresponds to the noisy environment (β coefficient). When ∇u(x, t) ≠ 0 the player term of the PDE takes the form

α( (c − 1)/c · I_(x,t)(∇u(x, t)) + (c + 1)/2 · u_t(x, t) − 1/2 · ⟨D²u(x, t) J_(x,t)(∇u(x, t)), J_(x,t)(∇u(x, t))⟩ ),

where Ω ⊂ R^N is a 'spatial' domain and (0, T] with T ∈ R is a 'temporal' domain.

### Introduction

An interesting problem introduced by David Aldous [1] and partially solved by McKean and Shepp [7] is the following. Suppose there are K one-dimensional Brownian motion processes starting at x = 1 which we can control by adding to the K processes drifts which sum up to 1. What is the optimal policy (division of drift) if the objective is to maximize the probability that the K processes never hit 0? What is the optimal policy if the objective is instead to maximize the expected number of processes that never hit 0? McKean and Shepp have addressed the case K = 2, but in general the problem is still open.
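The drift-allocation problem can be explored numerically. The sketch below is ours, not the paper's: it discretizes the K Brownian motions with an Euler scheme and estimates the expected number of survivors up to a finite horizon under one hypothetical policy, `push_the_weakest`. All names, parameters and the finite horizon are illustrative assumptions, and the policy is not claimed to be optimal.

```python
import math
import random

def expected_survivors(policy, K=2, dt=1e-3, horizon=1.0, trials=200, seed=0):
    """Monte Carlo estimate of the expected number of processes still above 0
    at a finite horizon (a crude proxy for 'never hitting 0')."""
    rng = random.Random(seed)
    steps = int(horizon / dt)
    total = 0
    for _ in range(trials):
        x = [1.0] * K          # all processes start at x = 1
        alive = [True] * K
        for _ in range(steps):
            drift = policy(x, alive)  # non-negative drifts summing to at most 1
            for i in range(K):
                if alive[i]:
                    x[i] += drift[i] * dt + math.sqrt(dt) * rng.gauss(0.0, 1.0)
                    if x[i] <= 0.0:
                        alive[i] = False
        total += sum(alive)
    return total / trials

def push_the_weakest(x, alive):
    """One illustrative policy: give the whole unit drift to the lowest survivor."""
    d = [0.0] * len(x)
    live = [i for i, a in enumerate(alive) if a]
    if live:
        d[min(live, key=lambda i: x[i])] = 1.0
    return d
```

Swapping in other policies (equal split, push the strongest) and comparing the estimates is a direct way to build intuition for the open problem.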
A natural zero-sum game-theory variant of this problem is to place K one-dimensional Brownian motion processes at 0 and consider two players, I and II, each having a drift of magnitude 1, whose objective is to maximize the number of processes which hit 1 (Player I's objective) or −1 (Player II's objective). The problem now would be to find the set of Nash equilibria and the associated policies for the players. An even more general problem would be to make both the set of actions available to the players and the Brownian motion dependent on the current state of the game as well as on the current time in the game. This variant of the problem has many natural real-world applications in the context of solving resource-allocation problems for 'multi-theater conflicts'. A war between two countries that takes place along different fronts is one such example: the resources need to be divided among the different fronts, and the set of actions (deploying forces, etc.) available to the two countries depends on the current time and state of the conflict.

One way to approach the above problem is by using 'tug-of-war' type games. In the simple version of the ε-tug-of-war game a token is placed at some initial position in a domain D. At each stage of the game two players, I and II, flip a fair coin to decide whose turn it is, and the winner moves the token to any point in a ball of radius ε centered at the current position of the token. The game ends when the token reaches the boundary of D, at which point Player II pays Player I an amount of money depending on some 'boundary function' and the position on the boundary where the token exited D. The game is named this way because the players 'tug' in different directions, hoping to reach boundary points where they receive their best payoffs (the best payoff of Player I is the worst payoff of Player II and vice versa).
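The simple ε-tug-of-war just described is easy to simulate. A minimal sketch (ours, not the paper's): on the interval [0, 1] with boundary payoff F(0) = 0, F(1) = 1, Player I always pulls right by ε and Player II always pulls left by ε, so each round reduces to a fair ±ε step; the estimated value from x₀ is then approximately x₀, the linear function of the starting point.

```python
import random

def tug_of_war_1d(x0, eps=0.05, trials=4000, seed=1):
    """Estimate the value of the simple eps-tug-of-war on [0, 1] with
    boundary payoff F(0) = 0, F(1) = 1, under the greedy pull strategies."""
    rng = random.Random(seed)
    payoff = 0.0
    for _ in range(trials):
        x = x0
        while 0.0 < x < 1.0:
            # fair coin: the winner tugs the token eps toward their boundary
            x += eps if rng.random() < 0.5 else -eps
        payoff += 1.0 if x >= 1.0 else 0.0
    return payoff / trials
```

With linear boundary data the greedy strategies happen to be optimal here, which is why the Monte Carlo value matches the starting point.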
In [8] Peres, Schramm, Sheffield and Wilson showed that the value functions of the ε-tug-of-war games converge, as ε → 0⁺, to the solution of the equation ∆∞u = 0, where ∆∞ is the operator known as the infinity Laplacian, defined by ∆∞u = |∇u|⁻² ⟨D²u ∇u, ∇u⟩. Tug-of-war games are applicable to our problem since the sets of available movements for the players, which depend on position and time, model the sets of possible resource allocations at a specific place and time. For example, to model a game with two processes (K = 2) where the entire drift must be given to exactly one process (so the only choice for a player is which process should receive the drift), we can use a square as a domain: moving along the x-axis corresponds to giving the entire drift to one process, and moving along the y-axis corresponds to giving the entire drift to the other process. To account for the random movement of the particles we need to incorporate noise which is spatially and temporally dependent into the model. The work of Pezzo and Rossi in [11] enables us to consider action sets which are spatially and temporally dependent, but only in the context of games without noise. In this work we further generalize [11]. Many of the arguments of [11] carry over into our game, but the proof of the limit PDE does not carry over in a straightforward way, since the convergence of the value functions of the ε-games becomes more complicated. We show that the limit PDE of our game is an interpolation between the PDE found in [11] and a PDE corresponding to the noise.

The organization of the paper is as follows. In Section 2 we define our tug-of-war game, in Section 3 we recall some results needed for our main proof, and in Section 4 we prove the main theorem of the paper, Theorem 4.1.

For η > 0 we define a strip Γ_η around the boundary, on which we fix a bounded Borel function F; we call F the final payoff function. The ε-tug-of-war game is a zero-sum game which is played between two players, Player I and Player II. The game is played in the following way. At the beginning of the game we fix 0 < ε < η and place a token at a point (x₀, t₀) ∈ Ω_T.
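The connection between tug-of-war and ∆∞ mentioned above can be illustrated with a toy fixed-point iteration for the dynamic programming principle u(x) = ½(max + min of u over the ε-neighbourhood). This is our illustrative discretization, not the paper's construction: on a one-dimensional grid the ε-neighbourhood of a point is just its two neighbours, so the update is their midpoint, and the iteration converges to the linear (infinity-harmonic) interpolant of the boundary data.

```python
def infinity_harmonic_1d(n=20, iters=5000):
    """Fixed-point iteration for the DPP u = 1/2 (max + min over neighbours)
    on a 1-D grid with boundary data u(0) = 0, u(1) = 1."""
    u = [0.0] * (n + 1)
    u[n] = 1.0  # boundary values are held fixed throughout
    for _ in range(iters):
        for i in range(1, n):
            lo, hi = min(u[i - 1], u[i + 1]), max(u[i - 1], u[i + 1])
            u[i] = 0.5 * (lo + hi)  # the 1/2 (max + min) DPP update
    return u
```

In higher dimensions the same ½(max + min) update over ε-balls is genuinely nonlinear and produces infinity-harmonic, rather than harmonic, limits.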
Then the players toss a biased coin which is 'heads' with probability α and 'tails' with probability β, where α + β = 1. If the coin lands on 'heads' the players flip a fair coin and the winner picks a new game state (x₁, t₁) in a set A_ε(x₀, t₀) (this set, which is defined below, depends on the position (x₀, t₀) and on ε). If the coin lands on 'tails' a new game state (x₁, t₁) is chosen randomly from A_ε(x₀, t₀) according to a probability measure μ^ε_(x₀,t₀) defined below. Then the biased coin is tossed again, and the game continues in this way until the token hits the boundary strip Γ_ε. At the end of the game Player II pays Player I the amount given by the final payoff function F, so Player I earns F(x_τ, t_τ) and Player II earns −F(x_τ, t_τ), where τ is a stopping time equal to the number of rounds that took place before the game ended. We have that 0 < τ < +∞; see Remark 2.1.

The definition of the game gives rise to a sequence of random variables, the game states (x₀, t₀), (x₁, t₁), ..., (x_τ, t_τ), depending on the coin tosses, the strategies of the players and the probability measures of the noise. A strategy S_I for Player I is defined as a collection of measurable mappings S_I = {S^k_I}_{k=1}^τ such that the next game position is given by S^k_I applied to the partial history if the biased coin was 'heads' and Player I won the fair coin toss. Similarly, Player II plays according to a strategy S_II. The next game state is distributed according to a probability distribution p(· | (x₀, t₀), (x₁, t₁), ..., (x_k, t_k)) which, in our case, is given by the tosses of the biased and fair coins and the probability measures of the noise. The point (x₀, t₀), the domain Ω_T and the strategies S_I and S_II determine a unique probability measure P.
Thus, the expected payoff of Player I is given by E^{(x₀,t₀)}_{S_I,S_II}[F(x_τ, t_τ)]. The ε-value for Player I, when starting from (x₀, t₀), is then defined as

u^ε_I(x₀, t₀) = sup_{S_I} inf_{S_II} E^{(x₀,t₀)}_{S_I,S_II}[F(x_τ, t_τ)],

while the ε-value for Player II is given by

u^ε_II(x₀, t₀) = inf_{S_II} sup_{S_I} E^{(x₀,t₀)}_{S_I,S_II}[F(x_τ, t_τ)].

To define the sets A_ε(x, t), the possible movements of the ε-game at the points (x, t), we use the following construction.

A3. Continuity of A(x, t) with respect to (x, t).

A4. Let ⟨·, ·⟩ be the standard inner product on R^N. Then for every v ∈ R^N \ {0} there exists a unique (z, r) ∈ A(x, t) such that min{⟨v, y⟩ : y ∈ π₁(A(x, t))} = ⟨v, z⟩. We denote this point (z, r) by (J_(x,t)(v), I_(x,t)(v)). We have that ⟨v, J_(x,t)(v)⟩ ≤ 0, and (J_(x,t)(λv), I_(x,t)(λv)) = (J_(x,t)(v), I_(x,t)(v)) for any λ > 0, so (J_(x,t)(v), I_(x,t)(v)) depends only on the direction of v. Note that (−J_(x,t)(v), I_(x,t)(v)) ∈ A(x, t) and max{⟨v, y⟩ : y ∈ π₁(A(x, t))} = ⟨v, −J_(x,t)(v)⟩. In addition, we require that J_(x,t) : ∂B(0, 1) → ∂π₁(A(x, t)).

Having in hand the definition of A(x, t), we define A_ε(x, t) by rescaling A(x, t). Every r ∈ π₂(A_ε(x, t)) is of the form t + ε²((1 − c)/c · s_r − (c + 1)/2) for some s_r ∈ π₂(A(x, t)), and the map r ↦ s_r is a bijection. Let x = (x₁, ..., x_N); we define A^r_ε(x, t) := {(x + εy, r) : y ∈ A^{s_r}(x, t)}.

Remark 2.1. Note that by A1 and by the definition of A_ε(x, t), time decreases by at least cε² every round of the game, so τ is indeed finite.

We now define the probability measures μ^ε_(x,t). Let us first explain the construction of the measures informally. Every A(x, t) can be written as A(x, t) = ⋃_{s∈π₂(A(x,t))} A^s(x, t) × {s}. To pick a point in A(x, t) we choose a 'time' s ∈ π₂(A(x, t)) according to a probability measure μ²_(x,t) (the superscript 2 stands for the temporal direction, as the 2 in π₂), and in the 'time slice' A^s(x, t) we choose a point according to a probability measure μ^{1,s}_(x,t) (the first superscript 1 stands for the spatial direction, as the 1 in π₁, and the second superscript s stands for the time slice A^s(x, t)).
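A discrete stand-in for the construction in A4: over a finite sample of π₁(A(x, t)), the points achieving the minimum and maximum of ⟨v, y⟩ play the roles of J_(x,t)(v) and −J_(x,t)(v). The sketch below is ours and only illustrates the arg-min/arg-max and the fact that the result depends only on the direction of v.

```python
def supporting_points(v, points):
    """Discrete analogue of A4: among a finite sample of pi_1(A(x,t)),
    return the minimizer and maximizer of <v, y>
    (stand-ins for J(v) and -J(v))."""
    dot = lambda a, b: sum(ai * bi for ai, bi in zip(a, b))
    jmin = min(points, key=lambda y: dot(v, y))
    jmax = max(points, key=lambda y: dot(v, y))
    return jmin, jmax
```

The uniqueness demanded in A4 corresponds to the minimizer being unique, which for a sampled convex body means the sample has no ties in the direction v.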
Each μ^{1,s}_(x,t) is a measure on R^N and we construct it as a product of probability measures μ^{1,s,1}_(x,t), ..., μ^{1,s,N}_(x,t), where μ^{1,s,i}_(x,t) is a probability measure on the subspace spanned by the basis element e_i. Formally, for any (x, t) ∈ Ω_T we take a family of measures {μ^{1,s,i}_(x,t)}_{s∈π₂(A(x,t)), i∈{1,...,N}}, each supported on π_{1,i}(A^s(x, t)), and a measure μ²_(x,t) supported on π₂(A(x, t)), such that for each s ∈ π₂(A(x, t)) and i ∈ {1, ..., N} the measure μ^{1,s,i}_(x,t) is symmetric about the origin along the subspace spanned by e_i. We now define a probability measure μ^{1,s}_(x,t) on A^s(x, t) by setting μ^{1,s}_(x,t)(B) = (μ^{1,s,1}_(x,t) × ... × μ^{1,s,N}_(x,t))(B) for every B ⊂ A^s(x, t) which is measurable with respect to the product sigma-algebra generated by {μ^{1,s,i}_(x,t)}. Let μ²_(x,t) be a measure on R such that μ²_(x,t)(π₂(A(x, t))) = 1. Then for every B ⊂ A(x, t) which is measurable with respect to the product sigma-algebra generated by {μ^{1,s}_(x,t)}_{s∈π₂(A(x,t))} and μ²_(x,t), by the Kolmogorov extension theorem there exists a probability measure μ_(x,t) such that μ_(x,t)(B) = ∫ μ^{1,s}_(x,t)(B_s) μ²_(x,t)(ds), where the notation B_s for the time slice of B is clear. We make the following assumption about the measures.

A5. Continuity of the measures μ^{1,s,i}_(x,t) and μ²_(x,t) with respect to (x, t): let (x_ε, t_ε) → (x, t) ∈ Ω_T; then for every s_ε ∈ π₂(A(x_ε, t_ε)), if s_ε → s then μ^{1,s_ε,i}_(x_ε,t_ε) → μ^{1,s,i}_(x,t) and μ²_(x_ε,t_ε) → μ²_(x,t) in the sense of weak-* convergence.

Having μ_(x,t) defined, we now define μ^ε_(x,t) on A_ε(x, t). Following (2.1), let f_(x,t),ε and g_(x,t),ε : π₂(A(x, t)) → π₂(A_ε(x, t)) be the corresponding rescaling maps; we set μ^ε_(x,t) to be the push-forward of μ_(x,t) under these maps.

Remark 2.3. Convergence issues: in Section 4 we take limits of integrals of the form ∫ h(x_ε, t_ε, s) μ²_(x_ε,t_ε)(ds), where (x_ε, t_ε) → (x, t). To make sense of such limits we need to define a notion of convergence of sets and ensure that we can simultaneously take the limits of various quantities which depend on ε.
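Sampling from the product measure μ^{1,s}_(x,t) amounts to drawing each coordinate independently from its one-dimensional measure μ^{1,s,i}_(x,t). A minimal sketch under our own assumptions (uniform coordinate measures, which are symmetric about the origin as the construction requires; all names are ours):

```python
import random

def sample_spatial_noise(coord_samplers, rng):
    """Draw one increment from the product measure mu^{1,s}: coordinate i is
    drawn independently from a one-dimensional measure mu^{1,s,i}."""
    return [draw(rng) for draw in coord_samplers]

def symmetric_uniform(a):
    """An illustrative coordinate measure: uniform on [-a, a],
    symmetric about the origin as the construction demands."""
    return lambda rng: rng.uniform(-a, a)
```

The symmetry of each coordinate measure is what makes the noise unbiased, so empirical coordinate means should vanish as the sample grows.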
Let (Z, d) be a compact metric space. For A ⊂ Z and δ > 0 we set A^(δ) := {z ∈ Z : d(z, a) < δ for some a ∈ A}, and we let 2^Z be the set of nonempty compact subsets of Z.

Definition 2.4. Let A_n be a sequence of subsets of Z. Then we say that A_n converges to A in 2^Z, and write A_n → A, if for every δ > 0 there exists N ∈ N such that for all n > N we have A_n ⊂ A^(δ) and A ⊂ A_n^(δ).

Lemma 2.5. Suppose A_n → A in 2^Z. Then:
(i) If z ∈ A then there exist z_n ∈ A_n such that z_n → z.
(ii) If z_{n_k} ∈ A_{n_k} and z_{n_k} → z for some z, then z ∈ A.

The proof of the lemma is standard and we omit it. The following lemma will aid us in computing the limits described above.

Lemma 2.6. Let h be uniformly bounded, i.e. there exists M ∈ R such that |h(x, t, s)| < M for every (x, t, s). In addition, suppose that if (x_n, t_n) → (x, t) then h(x_n, t_n, s) → h(x, t, s) uniformly in s, i.e. for every δ > 0 there exists N ∈ N such that if n > N then |h(x, t, s) − h(x_n, t_n, s)| < δ for all s. Then

∫ h(x_n, t_n, s) μ²_(x_n,t_n)(ds) → ∫ h(x, t, s) μ²_(x,t)(ds).

Proof. For every δ we can take n large enough such that the integrands differ by at most δ. Since for every δ this holds for n > n_δ for some n_δ ∈ N, it remains to show that ∫ h(x, t, s) μ²_(x_n,t_n)(ds) → ∫ h(x, t, s) μ²_(x,t)(ds). By the uniform convergence of h with respect to s, and since both h and μ²_(x_n,t_n) are uniformly bounded, this follows from A5.

### Review of relevant results

The proof of the main result in [11] is based on four steps. First, a Dynamic Programming Principle (DPP) for the ε-value functions u^ε_I and u^ε_II is proven. Second, using the DPP, it is proven that the game has an ε-value function u^ε which is the only function satisfying the DPP with the given boundary values, and that u^ε = u^ε_I = u^ε_II. Third, it is proven that u^ε → v uniformly for some function v as ε → 0⁺. Finally, it is shown that v solves the PDE. The first of these steps uses results from [6]: the DPP holds for u^ε_I, a similar equation holds for u^ε_II, and the proof in [6] together with Remark 2.1 applies here.

Proof. The proof in [11] holds in our case.
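The two-sided δ-neighbourhood condition of Definition 2.4 can be checked directly for finite point sets. A small sketch, ours and purely illustrative (`hausdorff_close` is a hypothetical helper for subsets of R):

```python
def hausdorff_close(A, B, delta):
    """Check A is inside the delta-neighbourhood of B and vice versa,
    i.e. the two-sided condition of Definition 2.4, for finite sets in R."""
    d = lambda p, S: min(abs(p - q) for q in S)  # distance from a point to a set
    return (all(d(a, B) < delta for a in A)
            and all(d(b, A) < delta for b in B))
```

For a sequence such as A_n = {0, 1 + 1/n} the condition holds for every δ once n is large, so A_n → {0, 1} in the sense of Definition 2.4.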
The idea of the proof is to show that the functions {u^ε}_{ε>0} satisfy a variant of the Arzelà-Ascoli lemma, so they converge uniformly to a limit. A key fact in the proof is that linear functions of the form ℓ(x, t) = ⟨v, x⟩ + b, with v ∈ R^N and b ∈ R, are solutions of the DPP (with F(x, t) = ℓ(x, t) in Γ_ε) (Remark 3.2 in [11]).

### The limit equation

Theorem 4.1. If u^ε converges uniformly to u, then u solves in the viscosity sense the PDE G(D²u, ∇u, u_t, x, t) = 0, where D²u and ∇u are the spatial Hessian and gradient of u respectively, u_t is the partial derivative of u with respect to the time variable, and G : S^N × R^N × R × Ω_T → R, where S^N is the set of N × N symmetric matrices, E_{μ²}[·] denotes a vector whose ith entry is an integral against μ²_(x,t)(ds), and Diag(M) is a vector containing the ordered entries of the diagonal of M. J_(x,t) and I_(x,t) were defined in A4, and Î_(x,t)(s) is uniquely defined by an analogous condition.

Remark 4.2. The PDE in Theorem 4.1 is an interpolation between the PDE in [11], which corresponds to the tug-of-war game with spatial and temporal dependence without noise, and a PDE which corresponds to a game where the movements are determined only by noise. If α = 1 then the game reduces to the game in [11], so the PDE should reduce to the PDE in [11].

We first give the definition of a viscosity solution to a PDE of the form Gu = 0 for a function u and a degenerate elliptic operator G. To this end we define the upper and lower semicontinuous envelopes of G, denoted by G* and G_*. Let

C_ε(M, v, s, x, t) := {(M̂, v̂, ŝ, x̂, t̂) : ‖M − M̂‖ + |v − v̂| + |s − ŝ| + |x − x̂| + |t − t̂| < ε}.

We define G*(M, v, s, x, t) = lim sup_{ε→0} {G(M̂, v̂, ŝ, x̂, t̂) : (M̂, v̂, ŝ, x̂, t̂) ∈ C_ε(M, v, s, x, t)} and G_*(M, v, s, x, t) := −(−G)*(M, v, s, x, t) for every (M, v, s, x, t) ∈ S^N × R^N × R × Ω_T.

Definition 4.3. A function u ∈ C(Ω_T) is a viscosity solution to the PDE Gu = 0 with boundary values F if u(x, t) = F(x, t) on Γ and the following two conditions hold.
(i) For every φ ∈ C^{2,1}(Ω_T) such that u − φ has a strict minimum at (x₀, t₀) ∈ Ω_T we have G*(D²φ(x₀, t₀), ∇φ(x₀, t₀), φ_t(x₀, t₀), x₀, t₀) ≥ 0.

(ii) For every φ ∈ C^{2,1}(Ω_T) such that u − φ has a strict maximum at (x₀, t₀) ∈ Ω_T we have G_*(D²φ(x₀, t₀), ∇φ(x₀, t₀), φ_t(x₀, t₀), x₀, t₀) ≤ 0.

We now characterize the envelopes of G. Proof. The proof in [11] holds mutatis mutandis.

We now turn to the proof of our main result.

Proof (Theorem 4.1). First, since u^ε → u uniformly and u^ε = F on Γ, we have that u = F on Γ. We show that if φ ∈ C^{2,1}(Ω_T) and u − φ has a strict local minimum at (x₀, t₀), then G*(D²φ(x₀, t₀), ∇φ(x₀, t₀), φ_t(x₀, t₀), x₀, t₀) ≥ 0; a similar proof holds for the reverse inequality if u − φ has a strict local maximum at (x₀, t₀). Since u − φ has a strict local minimum at (x₀, t₀) we have u(x, t) − φ(x, t) > u(x₀, t₀) − φ(x₀, t₀) if (x, t) ≠ (x₀, t₀). Using the uniform convergence of u^ε to u, there exists a sequence (x_ε, t_ε) → (x₀, t₀) such that u^ε(x, t) − φ(x, t) ≥ u^ε(x_ε, t_ε) − φ(x_ε, t_ε) − o(ε²) for every (x, t) in a neighborhood of (x₀, t₀). Hence

max_{(y,r)∈A_ε(x_ε,t_ε)} u^ε(y, r) ≥ max_{(y,r)∈A_ε(x_ε,t_ε)} φ(y, r) + u^ε(x_ε, t_ε) − φ(x_ε, t_ε) − o(ε²),

and similarly for the min. By the same argument,