Two-way deconfounder for off-policy evaluation in causal reinforcement learning

Yu, Shuguang; Fang, Shuxing; Peng, Ruixin; Qi, Zhengling; Zhou, Fan; and Shi, Chengchun (2024) Two-way deconfounder for off-policy evaluation in causal reinforcement learning. In: 38th Annual Conference on Neural Information Processing Systems (NeurIPS 2024), 2024-12-10 to 2024-12-15, Vancouver Convention Center, Vancouver, Canada. (In press)

This paper studies off-policy evaluation (OPE) in the presence of unmeasured confounders. Inspired by the two-way fixed-effects regression model widely used in the panel data literature, we propose a two-way unmeasured confounding assumption to model the system dynamics in causal reinforcement learning. Building on this assumption, we develop a two-way deconfounder algorithm that uses a neural tensor network to simultaneously learn the unmeasured confounders and the system dynamics, from which a model-based estimator is constructed for consistent policy value estimation. We illustrate the effectiveness of the proposed estimator through theoretical results and numerical experiments.
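To make the two-way idea concrete, the snippet below is a minimal, hypothetical PyTorch sketch of such an architecture: a trajectory-specific latent factor and a time-specific latent factor stand in for the unmeasured confounder (mirroring the two-way fixed-effects decomposition), a bilinear tensor layer captures their interaction, and a small head maps the observed state-action pair plus that interaction to a predicted reward. The class name, layer sizes, and the choice of predicting the reward rather than the full transition are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class TwoWayDeconfounderSketch(nn.Module):
    """Illustrative two-way decomposition of an unmeasured confounder.

    The confounder at (trajectory i, time t) is modeled through the
    interaction of a trajectory-specific latent u_i and a time-specific
    latent v_t, combined by a neural tensor (bilinear) layer with the
    observed state and action to predict the reward.
    """

    def __init__(self, n_traj, horizon, state_dim, action_dim,
                 latent_dim=8, hidden=64):
        super().__init__()
        # Learnable per-trajectory and per-time latent factors (the "two ways").
        self.traj_latent = nn.Embedding(n_traj, latent_dim)
        self.time_latent = nn.Embedding(horizon, latent_dim)
        # Bilinear (tensor) interaction between the two latent factors.
        self.bilinear = nn.Bilinear(latent_dim, latent_dim, hidden)
        # Head mapping observed inputs plus the interaction to a reward estimate.
        self.head = nn.Sequential(
            nn.Linear(state_dim + action_dim + hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, traj_idx, time_idx, state, action):
        u = self.traj_latent(traj_idx)              # (batch, latent_dim)
        v = self.time_latent(time_idx)              # (batch, latent_dim)
        interaction = torch.relu(self.bilinear(u, v))
        x = torch.cat([state, action, interaction], dim=-1)
        return self.head(x).squeeze(-1)             # predicted reward

# A model like this could be fit by regressing observed rewards on
# (trajectory index, time index, state, action); the learned dynamics and
# latents would then feed a model-based estimate of the target policy value.
```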


Accepted Version: restricted to repository staff only until 1 January 2100.
