Uniform value and decidability in ergodic blind stochastic games
We study a class of two-player zero-sum stochastic games known as blind stochastic games, in which the players neither observe the state nor receive any information about it during the game. A central concept for analyzing long-duration stochastic games is the uniform value. A game has uniform value v if, for every ε > 0, Player 1 (resp., Player 2) has a strategy that guarantees, for all sufficiently large n, an expected average payoff over the first n stages of at least v − ε (resp., at most v + ε), regardless of the opponent's strategy. Prior work has shown that the uniform value may fail to exist in general blind stochastic games. To address this, we introduce a subclass called ergodic blind stochastic games, defined by imposing an ergodicity condition on the state transitions. For this subclass, we prove that the uniform value exists and provide an algorithm to approximate it, thereby establishing the decidability of the approximation problem. Notably, this decidability result is new even in the single-player setting of Partially Observable Markov Decision Processes (POMDPs). Furthermore, we show that no algorithm can compute the uniform value exactly, which underlines the tightness of our result. Finally, we establish that the uniform value is independent of the initial belief.
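The ε-guarantee in this definition can be written out formally. The sketch below states Player 1's side of the condition (Player 2's side is symmetric with v + ε); the notation γ_n(b, σ, τ) for the expected average payoff, the strategy symbols σ and τ, and the initial belief b are our own shorthand, not notation taken from the paper.

```latex
% Sketch of the uniform-value condition for Player 1 (Player 2's side is symmetric).
% Assumed notation (not from the paper):
%   b                        -- initial belief over the states
%   \sigma, \tau             -- strategies of Player 1 and Player 2
%   \gamma_n(b,\sigma,\tau)  -- expected average payoff over the first n stages
\[
\forall \varepsilon > 0 \;\; \exists \sigma_\varepsilon \;\; \exists N \;\;
\forall \tau \;\; \forall n \ge N : \qquad
\gamma_n(b, \sigma_\varepsilon, \tau) \;\ge\; v - \varepsilon .
\]
```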
| Field | Value |
|---|---|
| Item Type | Article |
| Copyright holders | © 2025 The Author(s) |
| Departments | LSE |
| Date Deposited | 01 Dec 2025 |
| Acceptance Date | 08 Nov 2025 |
| URI | https://researchonline.lse.ac.uk/id/eprint/130381 |
| Version | Accepted Version |
| Access | Restricted to Repository staff only until 1 January 2100 |
| Licence | Creative Commons: Attribution 4.0 |