Fast global convergence of natural policy gradient methods with entropy regularization
S Cen, C Cheng, Y Chen, Y Wei, Y Chi
Operations Research 70 (4), 2563-2578, 2022
Cited by: 225; Year: 2022

Communication-efficient distributed optimization in networks with gradient tracking and variance reduction
B Li, S Cen, Y Chen, Y Chi
Journal of Machine Learning Research 21 (180), 1-51, 2020
Cited by: 135*; Year: 2020

Policy mirror descent for regularized reinforcement learning: A generalized framework with linear convergence
W Zhan, S Cen, B Huang, Y Chen, JD Lee, Y Chi
SIAM Journal on Optimization 33 (2), 1061-1091, 2023
Cited by: 86; Year: 2023

Fast policy extragradient methods for competitive games with entropy regularization
S Cen, Y Wei, Y Chi
Advances in Neural Information Processing Systems 34, 27952-27964, 2021
Cited by: 85; Year: 2021

A stochastic semismooth Newton method for nonsmooth nonconvex optimization
A Milzarek, X Xiao, S Cen, Z Wen, M Ulbrich
SIAM Journal on Optimization 29 (4), 2916-2948, 2019
Cited by: 41; Year: 2019

Faster Last-iterate Convergence of Policy Optimization in Zero-Sum Markov Games
S Cen, Y Chi, SS Du, L Xiao
International Conference on Learning Representations (ICLR), 2023
Cited by: 40; Year: 2023

Convergence of distributed stochastic variance reduced methods without sampling extra data
S Cen, H Zhang, Y Chi, W Chen, TY Liu
IEEE Transactions on Signal Processing 68, 3976-3989, 2020
Cited by: 32; Year: 2020

Independent natural policy gradient methods for potential games: Finite-time global convergence with entropy regularization
S Cen, F Chen, Y Chi
2022 IEEE 61st Conference on Decision and Control (CDC), 2833-2838, 2022
Cited by: 15; Year: 2022

Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF
S Cen, J Mei, K Goshvadi, H Dai, T Yang, S Yang, D Schuurmans, Y Chi, ...
arXiv preprint arXiv:2405.19320, 2024
Cited by: 8; Year: 2024

Asynchronous Gradient Play in Zero-Sum Multi-agent Games
R Ao, S Cen, Y Chi
International Conference on Learning Representations (ICLR), 2023
Cited by: 7; Year: 2023

Federated Natural Policy Gradient Methods for Multi-task Reinforcement Learning
T Yang, S Cen, Y Wei, Y Chen, Y Chi
arXiv preprint arXiv:2311.00201, 2023
Cited by: 4; Year: 2023

Faster WIND: Accelerating Iterative Best-of-N Distillation for LLM Alignment
T Yang, J Mei, H Dai, Z Wen, S Cen, D Schuurmans, Y Chi, B Dai
arXiv preprint arXiv:2410.20727, 2024
Year: 2024

Beyond Expectations: Learning with Stochastic Dominance Made Practical
S Cen, J Mei, H Dai, D Schuurmans, Y Chi, B Dai
arXiv preprint arXiv:2402.02698, 2024
Year: 2024

Global Convergence of Policy Gradient Methods in Reinforcement Learning, Games and Control
S Cen, Y Chi
arXiv preprint arXiv:2310.05230, 2023
Year: 2023