Online Learning Algorithm for Distributed Convex Optimization With Time-Varying Coupled Constraints and Bandit Feedback.

abstract

This article focuses on multiagent distributed-constrained optimization problems in a dynamic environment, in which a group of agents aims to cooperatively optimize a sum of time-changing local cost functions subject to time-varying coupled constraints. Both the local cost functions and constraint functions are unrevealed to an individual agent until an action is submitted. We first investigate a gradient-feedback scenario, where each agent can access both values and gradients of cost functions and constraint functions owned by itself at the chosen action. Then, we design a distributed primal-dual online learning algorithm and show that the proposed algorithm can achieve the sublinear bounds for both the regret and constraint violations. Furthermore, we extend the gradient-feedback algorithm to a gradient-free setup, where an individual agent has only attained the values of local cost functions and constraint functions at two queried points near the selected action. We develop a bandit version of the previous method and give the explicitly sublinear bounds on the expected regret and expected constraint violations. The results indicate that the bandit algorithm can achieve almost the same performance as the gradient-feedback algorithm under wild conditions. Finally, numerical simulations on an electric vehicle charging problem demonstrate the effectiveness of the proposed algorithms.

authors

Huang, Tingwen

published proceedings

IEEE Trans Cybern

author list (cited authors)

Li, J., Gu, C., Wu, Z., & Huang, T.

citation count

23

complete list of authors

Li, Jueyou||Gu, Chuanye||Wu, Zhiyou||Huang, Tingwen

publication date

February 2022

publisher

Institute of Electrical and Electronics Engineers (IEEE) Publisher

published in

IEEE TRANSACTIONS ON CYBERNETICS Journal

keywords

Bandit Feedback
Convex Functions
Cost Function
Cybernetics
Distributed Convex Optimization
Heuristic Algorithms
Linear Programming
Online Learning
Regret Analysis
Resource Management
Time-varying Constraint

Digital Object Identifier (DOI)

10.1109/TCYB.2020.2990796

start page

1009

end page

1020

volume

52

issue

2

URL

http://dx.doi.org/10.1109/tcyb.2020.2990796

user-defined tag

7 Affordable and Clean Energy

Online Learning Algorithm for Distributed Convex Optimization With Time-Varying Coupled Constraints and Bandit Feedback. Academic Article

Overview

abstract

authors

published proceedings

author list (cited authors)

citation count

complete list of authors

publication date

publisher

published in

Research

keywords

Identity

Digital Object Identifier (DOI)

Additional Document Info

start page

end page

volume

issue

Other

URL

user-defined tag