Can decentralized algorithms outperform centralized algorithms? A case study for decentralized parallel stochastic gradient descent
X Lian, C Zhang, H Zhang, CJ Hsieh, W Zhang, J Liu
Advances in neural information processing systems 30, 2017
Cited by: 1418, Year: 2017

Asynchronous decentralized parallel stochastic gradient descent
X Lian, W Zhang, C Zhang, J Liu
International Conference on Machine Learning, 3043-3052, 2018
Cited by: 594, Year: 2018

Asynchronous parallel stochastic gradient for nonconvex optimization
X Lian, Y Huang, Y Li, J Liu
Advances in Neural Information Processing Systems, 2737-2745, 2015
Cited by: 580, Year: 2015

D²: Decentralized Training over Decentralized Data
H Tang, X Lian, M Yan, C Zhang, J Liu
International Conference on Machine Learning, 4848-4856, 2018
Cited by: 422, Year: 2018

Staleness-aware Async-SGD for Distributed Deep Learning
W Zhang, S Gupta, X Lian, J Liu
International Joint Conference on Artificial Intelligence, 2016
Cited by: 338, Year: 2016

Doublesqueeze: Parallel stochastic gradient descent with double-pass error-compensated compression
H Tang, C Yu, X Lian, T Zhang, J Liu
International Conference on Machine Learning, 6155-6165, 2019
Cited by: 279, Year: 2019

Douzero: Mastering doudizhu with self-play deep reinforcement learning
D Zha, J Xie, W Ma, S Zhang, X Lian, X Hu, J Liu
International Conference on Machine Learning, 12333-12344, 2021
Cited by: 148, Year: 2021

A Comprehensive Linear Speedup Analysis for Asynchronous Stochastic Parallel Optimization from Zeroth-Order to First-Order
X Lian, H Zhang, CJ Hsieh, Y Huang, J Liu
Advances in Neural Information Processing Systems, 2016
Cited by: 128, Year: 2016

Finite-sum Composition Optimization via Variance Reduced Gradient Descent
X Lian, M Wang, J Liu
Artificial Intelligence and Statistics, 2017
Cited by: 99, Year: 2017

1-bit adam: Communication efficient large-scale training with adam’s convergence speed
H Tang, S Gan, AA Awan, S Rajbhandari, C Li, X Lian, J Liu, C Zhang, ...
International Conference on Machine Learning, 10118-10129, 2021
Cited by: 96, Year: 2021

Asynchronous Parallel Greedy Coordinate Descent
Y You*, X Lian* (equal contribution), J Liu, HF Yu, I Dhillon, J Demmel, ...
Advances in Neural Information Processing Systems, 2016
Cited by: 54, Year: 2016

Revisit batch normalization: New understanding and refinement via composition optimization
X Lian, J Liu
The 22nd International Conference on Artificial Intelligence and Statistics …, 2019
Cited by: 53, Year: 2019

Deepsqueeze: Decentralization meets error-compensated compression
H Tang, X Lian, S Qiu, L Yuan, C Zhang, T Zhang, J Liu
arXiv preprint arXiv:1907.07346, 2019
Cited by: 45, Year: 2019

Stochastic recursive momentum for policy gradient methods
H Yuan, X Lian, J Liu, Y Zhou
arXiv preprint arXiv:2003.04302, 2020
Cited by: 34, Year: 2020

Efficient smooth non-convex stochastic compositional optimization via stochastic recursive gradient descent
W Hu, CJ Li, X Lian, J Liu, H Yuan
Advances in Neural Information Processing Systems 32, 2019
Cited by: 34, Year: 2019

Bagua: scaling up distributed learning with system relaxations
S Gan, X Lian, R Wang, J Chang, C Liu, H Shi, S Zhang, X Li, T Sun, ...
arXiv preprint arXiv:2107.01499, 2021
Cited by: 30, Year: 2021

Persia: An open, hybrid system scaling deep learning-based recommenders up to 100 trillion parameters
X Lian, B Yuan, X Zhu, Y Wang, Y He, H Wu, L Sun, H Lyu, C Liu, X Dong, ...
Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and …, 2022
Cited by: 25, Year: 2022

Persia: a hybrid system scaling deep learning based recommenders up to 100 trillion parameters
X Lian, B Yuan, X Zhu, Y Wang, Y He, H Wu, L Sun, H Lyu, C Liu, X Dong, ...
arXiv preprint arXiv:2111.05897, 2021
Cited by: 14, Year: 2021

Stochastic recursive variance reduction for efficient smooth non-convex compositional optimization
H Yuan, X Lian, J Liu
arXiv preprint arXiv:1912.13515, 2019
Cited by: 12, Year: 2019

NMR evidence for field-induced ferromagnetism in (Li0.8Fe0.2)OHFeSe superconductor
YP Wu, D Zhao, XR Lian, XF Lu, NZ Wang, XG Luo, XH Chen, T Wu
Physical Review B 91 (12), 125107, 2015
Cited by: 12, Year: 2015