Zichong Li

I am a Ph.D. student in Machine Learning at Georgia Institute of Technology, advised by Prof. Tuo Zhao. I received my bachelor's and master's degrees from the University of Science and Technology of China. My research focuses on efficient and reliable large language model training, including optimizer and architecture design for pretraining, as well as training algorithms for mid- and post-training. My CV can be found here.

Publications

  1. NorMuon: Making Muon More Efficient and Scalable. Zichong Li*, Liming Liu*, Chen Liang, Weizhu Chen, Tuo Zhao. ICML 2026. Spotlight, top 2.2%.
  2. Shuffle the Context: RoPE-Perturbed Self-Distillation for Long-Context Adaptation. Zichong Li, Chen Liang, Liliang Ren, Tuo Zhao, Yelong Shen, Weizhu Chen. ICML 2026.
  3. COSMOS: A Hybrid Adaptive Optimizer for Efficient Training of Large Language Models. Liming Liu, Zhenghao Xu, Zixuan Zhang, Hao Kang, Zichong Li, Chen Liang, Weizhu Chen, Tuo Zhao. ICLR 2026.
  4. SlimMoE: Structured Compression of Large MoE Models via Expert Slimming and Distillation. Zichong Li, Chen Liang, Zixuan Zhang, Ilgee Hong, Young Jin Kim, Weizhu Chen, Tuo Zhao. COLM 2025.
  5. LLMs Can Generate a Better Answer by Aggregating Their Own Responses. Zichong Li, Xinyu Feng, Yuheng Cai, Zixuan Zhang, Tianyi Liu, Chen Liang, Weizhu Chen, Haoyu Wang, Tuo Zhao. arXiv 2025.
  6. Mitigating Tail Latency for On-Device Inference with Load-Balanced Heterogeneous Models. Mu Yuan, Lan Zhang, Di Duan, Liekang Zeng, Miao-Hui Song, Zichong Li, Guoliang Xing, Xiang-Yang Li. IEEE TMC 2025.
  7. Adaptive Preference Scaling for Reinforcement Learning with Human Feedback. Ilgee Hong*, Zichong Li*, Alexander Bukharin, Yixiao Li, Haoming Jiang, Tianbao Yang, Tuo Zhao. NeurIPS 2024. *Equal contribution.
  8. Robust Reinforcement Learning from Corrupted Human Feedback. Alexander Bukharin, Ilgee Hong, Haoming Jiang, Zichong Li, Qingru Zhang, Zixuan Zhang, Tuo Zhao. NeurIPS 2024.
  9. Beyond Point Prediction: Score Matching-based Pseudolikelihood Estimation of Neural Marked Spatio-Temporal Point Process. Zichong Li, Qunzhi Xu, Zhenghao Xu, Yajun Mei, Tuo Zhao, Hongyuan Zha. ICML 2024.
  10. SMURF-THP: Score Matching-based Uncertainty Quantification for Transformer Hawkes Process. Zichong Li, Yanbo Xu, Simiao Zuo, Haoming Jiang, Chao Zhang, Tuo Zhao, Hongyuan Zha. ICML 2023.
  11. Efficient Deep Ensemble Inference via Query Difficulty-dependent Task Scheduling. Zichong Li, Lan Zhang, Mu Yuan, Miaohui Song, Qi Song. ICDE 2023.
  12. CoTel: Ontology-Neural Co-Enhanced Text Labeling. Miaohui Song, Lan Zhang, Mu Yuan, Zichong Li, Qi Song, Yijun Liu, Guidong Zheng. WWW 2023.
  13. Transformer Hawkes Process. Simiao Zuo, Haoming Jiang, Zichong Li, Tuo Zhao, Hongyuan Zha. ICML 2020.
  14. PRIMAL: A Linear Programming-based Sparse Learning Library in R and Python. Qianli Shen*, Zichong Li*, Yujia Xie, Tuo Zhao. Software. *Equal contribution.

Education

Georgia Institute of Technology

Ph.D. in Machine Learning

Advisor: Prof. Tuo Zhao

Aug. 2023 - Present, Atlanta, GA

University of Science and Technology of China

M.S. in Data Science, rank 1/56

Advisor: Prof. Lan Zhang

Sept. 2020 - June 2023, Hefei, China

University of Science and Technology of China

B.S. in Mathematics and Applied Mathematics / Probability Statistics

Sept. 2016 - June 2020, Hefei, China