Scaling Up Reinforcement Learning through Targeted Exploration

abstract

Recent Reinforcement Learning (RL) algorithms, such as R-MAX, make (with high probability) only a small number of poor decisions. In practice, these algorithms do not scale well as the number of states grows because they spend too much effort exploring. We introduce an RL algorithm, State TArgeted R-MAX (STAR-MAX), that explores a subset of the state space called the exploration envelope. When the exploration envelope equals the total state space, STAR-MAX behaves identically to R-MAX. When the exploration envelope is only a subset of the state space, a recovery rule is needed to keep exploration within the envelope. We compared existing algorithms with our algorithm employing various exploration envelopes. With an appropriate choice of exploration envelope, STAR-MAX scales far better than existing RL algorithms as the number of states increases. A possible drawback of our algorithm is its dependence on a good choice of exploration envelope and recovery rule. However, we show that an effective recovery rule can be learned on-line and that the exploration envelope can be learned from demonstrations. We also find that even randomly sampled exploration envelopes can improve cumulative rewards compared to R-MAX. We expect these results to lead to more efficient methods for RL in large-scale problems.
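
The idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm. It assumes one plausible reading of the abstract: R-MAX-style optimism is applied only to under-visited state-action pairs inside the exploration envelope, and a recovery rule chooses the action whenever the agent is outside the envelope. The names `R_MAX`, `M_KNOWN`, `GAMMA`, `is_optimistic`, and `choose_action`, along with all constants, are hypothetical and chosen for illustration only.

```python
from collections import defaultdict

R_MAX = 1.0    # assumed upper bound on the one-step reward
M_KNOWN = 2    # assumed visit count after which a state-action pair is "known"
GAMMA = 0.9    # assumed discount factor


def is_optimistic(state, action, counts, envelope):
    """Optimism applies only to under-visited pairs whose state is in the envelope."""
    return state in envelope and counts[(state, action)] < M_KNOWN


def choose_action(state, actions, q_values, counts, envelope, recovery_rule):
    """Outside the envelope, defer to the recovery rule. Inside it, act greedily
    on values that are optimistic for unknown in-envelope pairs."""
    if state not in envelope:
        return recovery_rule(state)

    def value(action):
        if is_optimistic(state, action, counts, envelope):
            return R_MAX / (1.0 - GAMMA)  # optimistic return for unknown pairs
        return q_values[(state, action)]

    return max(actions, key=value)


# Hypothetical usage on a toy problem with two actions.
actions = ["left", "right"]
counts = defaultdict(int)
q_values = defaultdict(float)
counts[(0, "left")] = M_KNOWN        # (0, "left") has been tried enough times
q_values[(0, "left")] = 0.5
envelope = {0, 1}                    # states the agent is allowed to explore


def recovery_rule(state):
    return "left"                    # assumed: "left" leads back to the envelope


inside = choose_action(0, actions, q_values, counts, envelope, recovery_rule)
outside = choose_action(5, actions, q_values, counts, envelope, recovery_rule)
```

In this sketch, setting `envelope` to the full set of states means the recovery rule is never consulted and every under-visited pair is treated optimistically, which mirrors the abstract's statement that STAR-MAX then behaves identically to R-MAX.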

authors

Choe, Yoonsuck

published proceedings

Proceedings of the AAAI Conference on Artificial Intelligence

author list (cited authors)

Mann, T., & Choe, Y.

citation count

1

complete list of authors

Mann, Timothy||Choe, Yoonsuck

publication date

November 2011

publisher

Association for the Advancement of Artificial Intelligence (AAAI)

published in

Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence Journal

keywords

46 Information And Computing Sciences
4602 Artificial Intelligence
4611 Machine Learning

Digital Object Identifier (DOI)

10.1609/aaai.v25i1.7929

International Standard Book Number (ISBN) 13

9781577355083

start page

435

end page

440

volume

25

issue

1

URL

http://dx.doi.org/10.1609/aaai.v25i1.7929

Scaling Up Reinforcement Learning through Targeted Exploration Conference Paper
