Belief Projection-Based Reinforcement Learning for Environments with Delayed Feedback
- Title
- Belief Projection-Based Reinforcement Learning for Environments with Delayed Feedback
- Authors
- Kim, Jangwon; Kim, Hangyeol; Kang, Jiwook; Baek, Jongchan; Han, Soohee
- Date Issued
- 2023-12-13
- Publisher
- NeurIPS Foundation
- Abstract
- We present a novel actor-critic algorithm for an environment with delayed feedback,
which addresses the state-space explosion problem of conventional approaches.
Conventional approaches use an augmented state constructed from the last observed
state and the actions executed since visiting the last observed state. Using the
augmented state space, the correct Markov decision process for delayed environments
can be constructed; however, this causes the state space to explode as the
number of delayed timesteps increases, leading to slow convergence. Our proposed
algorithm, called Belief-Projection-Based Q-learning (BPQL), addresses
the state-space explosion problem by evaluating the values of the critic for which
the input state size is equal to the original state-space size rather than that of the
augmented one. We compare BPQL to traditional approaches in continuous control
tasks and demonstrate that it significantly outperforms other algorithms in terms of
asymptotic performance and sample efficiency. We also show that BPQL solves
long-delayed environments, which conventional approaches are unable to do.
- URI
- https://oasis.postech.ac.kr/handle/2014.oak/122357
- Article Type
- Conference
- Citation
- 37th Conference on Neural Information Processing Systems (NeurIPS 2023), 2023-12-13
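The augmented-state construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' BPQL implementation; it only shows how the conventional augmented state (the last observed state plus the actions executed since) grows with the delay, while the original state size stays fixed. The function names `augmented_state` and `augmented_dim` and the example dimensions are hypothetical, chosen only for illustration.

```python
def augmented_state(last_observed_state, pending_actions):
    """Concatenate the last observed state with every action
    executed since that state was observed (hypothetical helper)."""
    flat = list(last_observed_state)
    for action in pending_actions:
        flat.extend(action)
    return flat


def augmented_dim(state_dim, action_dim, delay):
    """Size of the augmented state for a given number of delayed timesteps."""
    return state_dim + delay * action_dim


# Assumed example sizes: a 3-dimensional state and 2-dimensional actions.
STATE_DIM, ACTION_DIM = 3, 2

if __name__ == "__main__":
    for delay in (1, 5, 20):
        print(
            f"delay={delay:2d}  "
            f"augmented input size={augmented_dim(STATE_DIM, ACTION_DIM, delay):2d}  "
            f"original state size={STATE_DIM}"
        )
```

The augmented input grows linearly with the number of delayed timesteps, which is the state-space growth the abstract refers to. According to the abstract, BPQL instead evaluates a critic whose input state size matches the original state space.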