Title

Autoregressive Bandits in Near-Unstable or Unstable Environment.

Authors

Charniauski, Uladzimir; Yao Zheng

Abstract

AutoRegressive Bandits (ARBs) are a novel model of a sequential decision-making problem as an autoregressive (AR) process. In this online learning setting, the observed reward follows an autoregressive process whose parameters are unknown to the agent and whose dynamics depend on the actions the agent chooses. This study empirically demonstrates how assigning extreme values to the systemic stability index and other reward-governing parameters severely impairs ARB learning in the respective environment. We show that this algorithm suffers numerically larger, higher-order regret in a weakly stable environment and strictly exponential regret in an unstable environment over the considered optimization horizon. We also test ARBs against other bandit baselines in both weakly stable and unstable systems to investigate how the loss of systemic stability degrades their performance, and we demonstrate the potential advantage of choosing competing algorithms when stability is weakened. Finally, we evaluate the discussed bandit under various assigned values of key input parameters to study how this algorithm's performance might be improved under these extreme environmental conditions.
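The stability behavior described in the abstract can be illustrated with a minimal simulation sketch. The model below is an assumption for illustration only, not the paper's exact formulation: rewards follow x_t = bias + Σ_i γ_i · x_{t−i} + noise, where the sum of the AR coefficients γ_i plays the role of a stability index. When the coefficients sum to less than 1 the rewards stay bounded; when they sum to 1 or more the process is (near-)unstable and rewards can grow exponentially, which is the regime in which the abstract reports exponential regret.

```python
import numpy as np

def simulate_ar_rewards(gammas, bias=1.0, horizon=200, noise_scale=0.1, seed=0):
    """Simulate an AR(k) reward process x_t = bias + sum_i gammas[i]*x_{t-i} + noise.

    Hypothetical illustration: the sum of `gammas` acts as a stability index.
    Below 1, rewards settle near bias / (1 - sum(gammas)); at or above 1,
    the dynamic is unstable and rewards diverge over the horizon.
    """
    rng = np.random.default_rng(seed)
    history = [0.0] * len(gammas)  # most recent reward first
    rewards = []
    for _ in range(horizon):
        x = bias + sum(g * h for g, h in zip(gammas, history))
        x += rng.normal(0.0, noise_scale)
        rewards.append(x)
        history = [x] + history[:-1]  # shift the AR window
    return rewards

# Stable regime: coefficients sum to 0.8 < 1, rewards stay bounded near 5.
stable = simulate_ar_rewards([0.5, 0.3])
# Unstable regime: coefficients sum to 1.2 > 1, rewards blow up exponentially.
unstable = simulate_ar_rewards([0.7, 0.5])
```

In the unstable regime, any estimation error the bandit makes is amplified by the diverging reward scale, which is one intuition for why regret deteriorates so sharply when the stability index crosses 1.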

Subjects

AUTOREGRESSIVE models; REINFORCEMENT learning; MACHINE learning; ONLINE education; ROBBERS

Publication

American Journal of Undergraduate Research, 2024, Vol. 21, Issue 2, p. 15

ISSN

1536-4585

Publication type

Academic Journal

DOI

10.33697/ajur.2024.116
