Distributed learning of the global maximum in a two-player stochastic game with identical payoffs
Academic Article
Overview
abstract
Little is known about the distributed learning of the global maximum in a stochastic framework when there is no communication between the decision-makers. The case of two decision-makers is considered, and prior knowledge about the expected rewards is assumed. This prior knowledge captures the asymmetries that may be present in the rewards. It is shown that each decision-maker, completely unaware of the other, converges to the global optimum with arbitrary accuracy over time.