Testing stationarity and change point detection in reinforcement learning

Li, Mengbing; Shi, Chengchun; Wu, Zhenke; and Fryzlewicz, Piotr (2025) Testing stationarity and change point detection in reinforcement learning. Annals of Statistics. ISSN 0090-5364 (In press)

We consider reinforcement learning (RL) in possibly nonstationary environments. Many existing RL algorithms in the literature rely on the stationarity assumption, which requires the state transition and reward functions to be constant over time. However, this assumption is restrictive in practice and is likely to be violated in a number of applications, including traffic signal control, robotics and mobile health. In this paper, we develop a model-free test to assess the stationarity of the optimal Q-function based on pre-collected historical data, without additional online data collection. Based on the proposed test, we further develop a change point detection method that can be naturally coupled with existing state-of-the-art RL methods designed for stationary environments, enabling online policy optimization in nonstationary environments. The usefulness of our method is illustrated by theoretical results, simulation studies, and a real data example from the 2018 Intern Health Study. A Python implementation of the proposed procedure is publicly available at https://github.com/limengbinggz/CUSUM-RL.
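As a rough illustration of the kind of workflow the abstract describes (scanning pre-collected trajectories for a change point, then handing only the post-change data to a stationary RL method), here is a minimal, self-contained Python sketch. It is not the authors' procedure and does not use their CUSUM-RL package; the tabular toy environment, the one-step Q-learning targets, and the max-contrast statistic below are all simplifying assumptions made for this example. For the actual method and implementation, see https://github.com/limengbinggz/CUSUM-RL.

"""
Illustrative sketch only: a toy CUSUM-style change point scan on offline RL
trajectories. All modelling choices (tabular MDP, one-step Q-learning
targets, max-contrast statistic) are assumptions made for this example and
are not taken from the paper.
"""
import numpy as np

def estimate_q(transitions, n_states, n_actions, gamma=0.9, sweeps=100, lr=0.1):
    """Tabular Q estimate from a list of (s, a, r, s_next) tuples."""
    q = np.zeros((n_states, n_actions))
    for _ in range(sweeps):
        for s, a, r, s_next in transitions:
            target = r + gamma * q[s_next].max()
            q[s, a] += lr * (target - q[s, a])
    return q

def cusum_scan(transitions, n_states, n_actions, min_seg=50, step=25):
    """
    Scan candidate change points t; at each t, contrast the Q estimate from
    data before t with the Q estimate from data after t. Returns the candidate
    with the largest (weighted) contrast and the full dictionary of statistics.
    """
    T = len(transitions)
    stats = {}
    for t in range(min_seg, T - min_seg, step):
        q_before = estimate_q(transitions[:t], n_states, n_actions)
        q_after = estimate_q(transitions[t:], n_states, n_actions)
        # CUSUM-style weighting: contrasts near the boundaries are downweighted.
        w = np.sqrt(t * (T - t)) / T
        stats[t] = w * np.max(np.abs(q_before - q_after))
    best_t = max(stats, key=stats.get)
    return best_t, stats

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_states, n_actions, T, true_change = 2, 2, 400, 250
    # Generate a toy offline dataset whose reward function switches mid-stream.
    transitions, s = [], 0
    for t in range(T):
        a = int(rng.integers(n_actions))
        r = float(a == 0) if t < true_change else float(a == 1)
        s_next = int(rng.integers(n_states))
        transitions.append((s, a, r, s_next))
        s = s_next
    est_t, stats = cusum_scan(transitions, n_states, n_actions)
    print("estimated change point:", est_t, "(true:", true_change, ")")
    # In practice one would then run a stationary RL method (e.g. fitted
    # Q-iteration) on transitions[est_t:] only, as the abstract suggests.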

Accepted Version (Request Copy): restricted to repository staff only until 1 January 2100. Available under Creative Commons: Attribution 4.0.
