A robust test for the stationarity assumption in sequential decision making

Wang, J., Shi, C. & Wu, Z. (2023). A robust test for the stationarity assumption in sequential decision making. Proceedings of Machine Learning Research, 36355-36379.

Reinforcement learning (RL) is a powerful technique that allows an autonomous agent to learn an optimal policy to maximize the expected return. The optimality of various RL algorithms relies on the stationarity assumption, which requires time-invariant state transition and reward functions. However, deviations from stationarity over extended periods often occur in real-world applications such as robotics control, health care and digital marketing, so policies learned under the stationarity assumption can be suboptimal. In this paper, we propose a model-based doubly robust procedure for testing the stationarity assumption and detecting change points in offline RL settings with a certain degree of homogeneity. Our proposed testing procedure is robust to model misspecification and can effectively control the type-I error while achieving high statistical power, especially in high-dimensional settings. Extensive comparative simulations and a real-world interventional mobile health example illustrate the advantages of our method in detecting change points and optimizing long-term rewards in high-dimensional, non-stationary environments.
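
To make the notion of a change point concrete, the following is a minimal sketch of a classical CUSUM-style scan over a logged reward sequence. It is not the doubly robust procedure proposed in the paper: it looks only at a shift in the mean reward, ignores states and transitions, and the function names, the Gaussian toy rewards and the `min_segment` parameter are all assumptions made for illustration.

```python
import random


def simulate_rewards(n_steps, change_point, mean_before, mean_after, seed=0):
    """Toy reward log (hypothetical): the mean reward shifts at `change_point`."""
    rng = random.Random(seed)
    return [
        rng.gauss(mean_before if t < change_point else mean_after, 1.0)
        for t in range(n_steps)
    ]


def cusum_change_point(rewards, min_segment=5):
    """Scan every split k and return (best_split, statistic).

    The statistic is the absolute difference between the mean reward before
    and after the split, scaled by sqrt(k * (n - k) / n). A large value
    suggests the reward function is not time-invariant.
    """
    n = len(rewards)
    total = sum(rewards)
    prefix = 0.0
    best_split, best_stat = None, 0.0
    for k in range(1, n):
        prefix += rewards[k - 1]  # sum of the first k rewards
        if k < min_segment or n - k < min_segment:
            continue  # skip splits that leave a very short segment
        mean_left = prefix / k
        mean_right = (total - prefix) / (n - k)
        stat = abs(mean_left - mean_right) * (k * (n - k) / n) ** 0.5
        if stat > best_stat:
            best_split, best_stat = k, stat
    return best_split, best_stat


if __name__ == "__main__":
    shifted = simulate_rewards(200, 120, 0.0, 2.0)
    stationary = simulate_rewards(200, 120, 0.0, 0.0)
    print("shifted:   ", cusum_change_point(shifted))
    print("stationary:", cusum_change_point(stationary))
```

On the shifted sequence the maximising split should fall close to the true change point, while the stationary sequence should yield a much smaller statistic. Turning the statistic into a test with controlled type-I error would need a calibrated critical value, which this sketch does not attempt.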

Item Type	Article
Copyright holders	© 2023 The Author
Departments	LSE > Academic Departments > Statistics
Date Deposited	17 November 2023
Acceptance Date	14 April 2023
URI	https://researchonline.lse.ac.uk/id/eprint/120775

Shi, Chengchun

QA75 Electronic computers. Computer science

Accepted Version
