AutoRegressive Bandits (ARBs) is a recently proposed model that casts a sequential decision-making problem as an autoregressive (AR) process. In this online learning setting, the observed reward follows an AR dynamic whose parameters are unknown to the agent and depend on the actions the agent chooses. This study empirically demonstrates how assigning extreme values to the systemic stability index and other reward-governing parameters severely impairs ARB learning in the corresponding environment. We show that the algorithm incurs numerically larger, higher-order regret in a weakly stable environment and strictly exponential regret in an unstable environment over the considered optimization horizon. We also test ARB against other bandit baselines in both weakly stable and unstable systems to investigate how the loss of systemic stability degrades their performance, and we demonstrate the potential advantage of competing algorithms when stability is weakened. Finally, we evaluate the ARB algorithm under various values of its key input parameters to study how its performance under these extreme environmental conditions might be improved.
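For concreteness, the sketch below illustrates the kind of environment studied here, assuming an AR(k) reward model of the form x_t = gamma_0(a_t) + sum_i gamma_i(a_t) x_{t-i} + noise, with stability governed by the sum of the AR coefficients (below 1: stable; close to 1: weakly stable; above 1: unstable). The class, parameter names, and exact parameterization are illustrative assumptions, not the paper's implementation.

    # Minimal sketch (not the paper's code): an AR(k) bandit environment where the
    # reward follows x_t = gamma_0(a_t) + sum_i gamma_i(a_t) * x_{t-i} + noise.
    # The stability index is taken here to be the largest sum of AR coefficients
    # over actions; < 1 is stable, close to 1 is weakly stable, > 1 is unstable.
    import numpy as np

    class ARBanditEnv:
        def __init__(self, gammas, noise_std=0.1, seed=0):
            # gammas: array of shape (n_actions, k + 1); column 0 is the bias term,
            # columns 1..k are the AR coefficients of each action.
            self.gammas = np.asarray(gammas, dtype=float)
            self.k = self.gammas.shape[1] - 1
            self.noise_std = noise_std
            self.rng = np.random.default_rng(seed)
            self.history = np.zeros(self.k)  # x_{t-1}, ..., x_{t-k}

        @property
        def stability_index(self):
            # Largest sum of AR coefficients across actions (illustrative definition).
            return self.gammas[:, 1:].sum(axis=1).max()

        def step(self, action):
            g = self.gammas[action]
            reward = g[0] + g[1:] @ self.history + self.rng.normal(0.0, self.noise_std)
            # Shift the AR history: the newest reward becomes x_{t-1}.
            self.history = np.concatenate(([reward], self.history[:-1]))
            return reward

    # Example: two actions, AR order k = 2. The second environment is unstable
    # (stability index > 1), so rewards, and hence regret, can grow exponentially.
    weakly_stable = ARBanditEnv(gammas=[[1.0, 0.5, 0.45], [0.5, 0.6, 0.38]])
    unstable      = ARBanditEnv(gammas=[[1.0, 0.7, 0.45], [0.5, 0.8, 0.40]])
    print(weakly_stable.stability_index, unstable.stability_index)  # 0.98, 1.2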