Testing stationarity and change point detection in reinforcement learning

Li, M., Shi, C., Wu, Z. & Fryzlewicz, P. (2025). Testing stationarity and change point detection in reinforcement learning. Annals of Statistics, 53(3), 1230-1256. https://doi.org/10.1214/25-aos2501

We consider reinforcement learning (RL) in possibly nonstationary environments. Many existing RL algorithms in the literature rely on the stationarity assumption that requires the state transition and reward functions to be constant over time. However, this assumption is restrictive in practice and is likely to be violated in a number of applications, including traffic signal control, robotics and mobile health. In this paper, we develop a model-free test to assess the stationarity of the optimal Q-function based on pre-collected historical data, without additional online data collection. Based on the proposed test, we further develop a change point detection method that can be naturally coupled with existing state-of-the-art RL methods designed in stationary environments for online policy optimization in nonstationary environments. The usefulness of our method is illustrated by theoretical results, simulation studies, and a real data example from the 2018 Intern Health Study. A Python implementation of the proposed procedure is publicly available at https://github.com/limengbinggz/CUSUM-RL.
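As context for the kind of procedure the abstract describes, the sketch below shows a generic CUSUM-type change point detector on a scalar series: it scans candidate split points and flags the one maximizing the normalized difference in means before and after the split. This is a toy analogue only; the paper's actual test statistic is built on the optimal Q-function estimated from historical trajectories, not on raw scalar observations, and the function and threshold choice here are illustrative assumptions.

```python
import numpy as np

def cusum_change_point(x):
    """Illustrative CUSUM-style change point detector (toy analogue,
    not the paper's Q-function-based test statistic).

    Returns the split index maximizing the normalized mean difference
    between the segments before and after the split, together with the
    value of that statistic.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    best_t, best_stat = None, -np.inf
    for t in range(2, n - 1):
        left, right = x[:t], x[t:]
        # CUSUM-type statistic: scaled absolute difference of segment means
        stat = np.sqrt(t * (n - t) / n) * abs(left.mean() - right.mean())
        if stat > best_stat:
            best_stat, best_t = stat, t
    return best_t, best_stat

# Example: a synthetic series with a mean shift at index 100
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, 100), rng.normal(2.0, 1.0, 100)])
t_hat, stat = cusum_change_point(x)
```

In practice one would compare `stat` against a critical value (e.g., obtained by a bootstrap, as nonstationary RL settings typically require) before declaring a change point, and then fit a stationary RL method on the post-change segment.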

Accepted Version
Creative Commons: Attribution 4.0
