51 |
Õ(T⁻¹) Convergence to (Coarse) Correlated Equilibria in Full-Information General-Sum Markov Games |
Weichao Mao, Haoran Qiu, Chen Wang, Hubertus Franke, Zbigniew Kalbarczyk and Tamer Başar |
96 |
Learning ϵ-Nash Equilibrium Policies in Stochastic Games with Unknown Independent Chains Using Online Mirror Descent |
Tiancheng Qin and S. Rasoul Etesami |
121 |
Bounded Robustness in Reinforcement Learning via Lexicographic Objectives |
Daniel Jarne Ornia, Licio Romao, Lewis Hammond, Manuel Mazo Jr and Alessandro Abate |
227 |
Conservative Model-based Imitation Learning |
Victor Kolev, Rafael Rafailov, Kyle Hatch, Jiajun Wu and Chelsea Finn |