Deeply-debiased off-policy interval estimation
Shi, C., Wan, R., Chernozhukov, V. & Song, R. (2021-07-18 - 2021-07-24) Deeply-debiased off-policy interval estimation [Paper]. International Conference on Machine Learning, Online.
Off-policy evaluation learns a target policy’s value with a historical dataset generated by a different behavior policy. In addition to a point estimate, many applications would benefit significantly from having a confidence interval (CI) that quantifies the uncertainty of the point estimate. In this paper, we propose a novel deeply-debiasing procedure to construct an efficient, robust, and flexible CI on a target policy’s value. Our method is justified by theoretical results and numerical experiments. A Python implementation of the proposed procedure is available at https://github.com/RunzheStat/D2OPE.
| Field | Value |
|---|---|
| Item Type | Conference or Workshop Item (Paper) |
| Copyright holders | © 2021 The Authors |
| Departments | LSE > Academic Departments > Statistics |
| Date Deposited | 24 Jun 2021 |
| Acceptance Date | 08 May 2021 |
| URI | https://researchonline.lse.ac.uk/id/eprint/110920 |
Explore Further
- https://www.lse.ac.uk/Statistics/People/Dr-Chengchun-Shi (Author)
- https://icml.cc/ (Official URL)
ORCID: https://orcid.org/0000-0001-7773-2099