A review of off-policy evaluation in reinforcement learning
Uehara, M., Shi, C. & Kallus, N. (2025). A review of off-policy evaluation in reinforcement learning. Statistical Science.
Reinforcement learning (RL) is one of the most vibrant research frontiers in machine learning and has recently been applied to a number of challenging problems. In this paper, we focus primarily on off-policy evaluation (OPE), one of the most fundamental topics in RL. In recent years, many OPE methods have been developed in the statistics and computer science literature. We discuss the efficiency bound of OPE, several state-of-the-art OPE methods and their statistical properties, and other related research directions under active exploration.
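To fix ideas, OPE estimates the value of a target (evaluation) policy from trajectories collected under a different behavior policy. A minimal sketch of the classical per-trajectory importance sampling estimator is below; this is an illustrative example, not a method or interface from the paper, and the `pi_e`/`pi_b` signatures are assumptions.

```python
def importance_sampling_ope(trajectories, pi_e, pi_b, gamma=0.99):
    """Per-trajectory importance sampling estimate of the target policy's value.

    trajectories: list of trajectories, each a list of (state, action, reward).
    pi_e(a, s), pi_b(a, s): action probabilities under the evaluation and
    behavior policies (hypothetical signatures, for illustration only).
    """
    estimates = []
    for traj in trajectories:
        weight, discounted_return = 1.0, 0.0
        for t, (s, a, r) in enumerate(traj):
            # Reweight by the likelihood ratio of the two policies.
            weight *= pi_e(a, s) / pi_b(a, s)
            discounted_return += (gamma ** t) * r
        estimates.append(weight * discounted_return)
    # Average the reweighted returns across trajectories.
    return sum(estimates) / len(estimates)
```

As a sanity check, when the evaluation and behavior policies coincide the weights are all one and the estimator reduces to the average discounted return of the logged data.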
| Item Type | Article |
|---|---|
| Copyright holders | © 2025 The Author(s) |
| Departments | LSE > Academic Departments > Statistics |
| Date Deposited | 14 Apr 2025 |
| Acceptance Date | 18 Mar 2025 |
| URI | https://researchonline.lse.ac.uk/id/eprint/127940 |
| Available files | Accepted Version (restricted to repository staff only until 1 January 2100) |
| License | Creative Commons: Attribution 4.0 |
ORCID: https://orcid.org/0000-0001-7773-2099