Zhuohan Li

Cited by

	All	Since 2019
Citations	3645	3638
h-index	15	15
i10-index	16	16

Citations per year
Year	Citations
2019	29
2020	124
2021	175
2022	241
2023	1552
2024	1510

Public access

Available	6 articles
Not available	0 articles

Based on funding mandates

Co-authors

Ion Stoica	Professor of Computer Science, UC Berkeley	Verified email at cs.berkeley.edu
Hao Zhang	UC San Diego	Verified email at ucsd.edu
Siyuan Zhuang	PhD Student, UC Berkeley	Verified email at berkeley.edu
Joseph E. Gonzalez	Professor of Computer Science, UC Berkeley	Verified email at berkeley.edu
Zi Lin	UC San Diego	Verified email at ucsd.edu
Di He	Peking University	Verified email at pku.edu.cn
Danyang Zhuo	Duke University	Verified email at duke.edu
Tao Qin	Senior Principal Research Manager, Microsoft Research	Verified email at microsoft.com
Liwei Wang	Professor, Peking University	Verified email at cis.pku.edu.cn
Tie-Yan Liu	Distinguished Scientist, Microsoft Research AI4Science | IEEE Fellow | ACM Fellow | AAIA Fellow	Verified email at microsoft.com
Zhiqing Sun	Carnegie Mellon University | Language Technologies Institute	Verified email at cs.cmu.edu
Zhifeng Chen	Google Inc.	Verified email at google.com
Kevin Lin	UC Berkeley	Verified email at berkeley.edu
Sheng Shen	UC Berkeley	Verified email at berkeley.edu
Eric Wallace	UC Berkeley	Verified email at berkeley.edu
Kurt Keutzer	Professor of the Graduate School, EECS, University of California, Berkeley	Verified email at berkeley.edu
Yuanzhong Xu	Google DeepMind	Verified email at utexas.edu
Linyuan Gong	UC Berkeley	Verified email at berkeley.edu
Dawn Song	Professor of Computer Science, UC Berkeley	Verified email at cs.berkeley.edu
Stephanie Wang	PhD student, UC Berkeley	Verified email at cs.berkeley.edu

Zhuohan Li

UC Berkeley

Verified email at berkeley.edu

Machine Learning, Distributed Systems


Title	Authors	Venue	Cited by	Year
Vicuna: An open-source chatbot impressing GPT-4 with 90%* ChatGPT quality	WL Chiang, Z Li, Z Lin, Y Sheng, Z Wu, H Zhang, L Zheng, S Zhuang, ...	See https://vicuna.lmsys.org (accessed 14 April 2023) 2 (3), 6, 2023	1127*	2023
Judging LLM-as-a-judge with MT-Bench and Chatbot Arena	L Zheng, WL Chiang, Y Sheng, S Zhuang, Z Wu, Y Zhuang, Z Lin, Z Li, ...	Advances in Neural Information Processing Systems 36, 2024	961*	2024
Efficient memory management for large language model serving with PagedAttention	W Kwon, Z Li, S Zhuang, Y Sheng, L Zheng, CH Yu, J Gonzalez, H Zhang, ...	Proceedings of the 29th Symposium on Operating Systems Principles, 611-626, 2023	274	2023
Train big, then compress: Rethinking model size for efficient training and inference of transformers	Z Li, E Wallace, S Shen, K Lin, K Keutzer, D Klein, J Gonzalez	International Conference on Machine Learning, 5958-5968, 2020	254	2020
Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning	L Zheng, Z Li, H Zhang, Y Zhuang, Z Chen, Y Huang, Y Wang, Y Xu, ...	arXiv preprint arXiv:2201.12023, 2022	188	2022
Understanding and improving transformer from a multi-particle dynamic system point of view	Y Lu, Z Li, D He, Z Sun, B Dong, T Qin, L Wang, TY Liu	arXiv preprint arXiv:1906.02762, 2019	157	2019
Efficient training of BERT by progressively stacking	L Gong, D He, Z Li, T Qin, L Wang, T Liu	International Conference on Machine Learning, 2337-2346, 2019	127	2019
FlexGen: High-throughput generative inference of large language models with a single GPU	Y Sheng, L Zheng, B Yuan, Z Li, M Ryabinin, B Chen, P Liang, C Ré, ...	International Conference on Machine Learning, 31094-31116, 2023	114	2023
Fast structured decoding for sequence models	Z Sun, Z Li, H Wang, D He, Z Lin, Z Deng	Advances in Neural Information Processing Systems 32, 2019	112	2019
Hint-based training for non-autoregressive machine translation	Z Li, Z Lin, D He, F Tian, T Qin, L Wang, TY Liu		75	2018
TeraPipe: Token-level pipeline parallelism for training large-scale language models	Z Li, S Zhuang, S Guo, D Zhuo, H Zhang, D Song, I Stoica	International Conference on Machine Learning, 6543-6552, 2021	72	2021
Towards binary-valued gates for robust LSTM training	Z Li, D He, F Tian, W Chen, T Qin, L Wang, T Liu	International Conference on Machine Learning, 2995-3004, 2018	58	2018
AlpaServe: Statistical multiplexing with model parallelism for deep learning serving	Z Li, L Zheng, Y Zhong, V Liu, Y Sheng, X Jin, Y Huang, Z Chen, H Zhang, ...	17th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2023	53	2023
LMSYS-Chat-1M: A large-scale real-world LLM conversation dataset	L Zheng, WL Chiang, Y Sheng, T Li, S Zhuang, Z Wu, Y Zhuang, Z Li, ...	arXiv preprint arXiv:2309.11998, 2023	28	2023
Hoplite: efficient and fault-tolerant collective communication for task-based distributed systems	S Zhuang, Z Li, D Zhuo, S Wang, E Liang, R Nishihara, P Moritz, I Stoica	Proceedings of the 2021 ACM SIGCOMM 2021 Conference, 641-656, 2021	23	2021
On optimizing the communication of model parallelism	Y Zhuang, L Zheng, Z Li, E Xing, Q Ho, J Gonzalez, I Stoica, H Zhang, ...	Proceedings of Machine Learning and Systems 5, 2023	15	2023
Fairness in serving large language models	Y Sheng, S Cao, D Li, B Zhu, Z Li, D Zhuo, JE Gonzalez, I Stoica	arXiv preprint arXiv:2401.00588, 2023	5	2023
Rearchitecting in-memory object stores for low latency	D Zhuo, K Zhang, Z Li, S Zhuang, S Wang, A Chen, I Stoica	Proceedings of the VLDB Endowment, 555-568, 2021	2	2021
Simple and Automatic Distributed Machine Learning on Ray	H Zhang, Z Li, L Zheng, I Stoica	Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data …, 2021		2021
Student Cluster Competition 2017, Team Peking University: Reproducing vectorization of the Tersoff multi-body potential on the Intel Broadwell architecture	Z Fu, L Yang, W Hou, Z Li, Y Wu, Y Cheng, X Wang, Y Liang	Parallel Computing 78, 28-32, 2018		2018
