Recent Highlights
- 2026: NorMuon was selected as an ICML 2026 Spotlight paper! The optimizer achieves new leaderboard results on modded-nanogpt, and the official implementation is available at zichongli5/NorMuon.
- 2025: The SlimMoE open models Phi-tiny-MoE-instruct and Phi-mini-MoE-instruct were released on Hugging Face in collaboration with Microsoft, with 850K+ total downloads!
Publications
- NorMuon: Making Muon More Efficient and Scalable. Zichong Li*, Liming Liu*, Chen Liang, Weizhu Chen, Tuo Zhao. ICML 2026. Spotlight, top 2.2%.
- Shuffle the Context: RoPE-Perturbed Self-Distillation for Long-Context Adaptation. Zichong Li, Chen Liang, Liliang Ren, Tuo Zhao, Yelong Shen, Weizhu Chen. ICML 2026.
- COSMOS: A Hybrid Adaptive Optimizer for Efficient Training of Large Language Models. Liming Liu, Zhenghao Xu, Zixuan Zhang, Hao Kang, Zichong Li, Chen Liang, Weizhu Chen, Tuo Zhao. ICLR 2026.
- SlimMoE: Structured Compression of Large MoE Models via Expert Slimming and Distillation. Zichong Li, Chen Liang, Zixuan Zhang, Ilgee Hong, Young Jin Kim, Weizhu Chen, Tuo Zhao. COLM 2025.
- LLMs Can Generate a Better Answer by Aggregating Their Own Responses. Zichong Li, Xinyu Feng, Yuheng Cai, Zixuan Zhang, Tianyi Liu, Chen Liang, Weizhu Chen, Haoyu Wang, Tuo Zhao. arXiv 2025.
- Mitigating Tail Latency for On-Device Inference with Load-Balanced Heterogeneous Models. Mu Yuan, Lan Zhang, Di Duan, Liekang Zeng, Miao-Hui Song, Zichong Li, Guoliang Xing, Xiang-Yang Li. IEEE TMC 2025.
- Adaptive Preference Scaling for Reinforcement Learning with Human Feedback. Ilgee Hong*, Zichong Li*, Alexander Bukharin, Yixiao Li, Haoming Jiang, Tianbao Yang, Tuo Zhao. NeurIPS 2024. *Equal contribution.
- Robust Reinforcement Learning from Corrupted Human Feedback. Alexander Bukharin, Ilgee Hong, Haoming Jiang, Zichong Li, Qingru Zhang, Zixuan Zhang, Tuo Zhao. NeurIPS 2024.
- Beyond Point Prediction: Score Matching-based Pseudolikelihood Estimation of Neural Marked Spatio-Temporal Point Process. Zichong Li, Qunzhi Xu, Zhenghao Xu, Yajun Mei, Tuo Zhao, Hongyuan Zha. ICML 2024.
- SMURF-THP: Score Matching-based Uncertainty Quantification for Transformer Hawkes Process. Zichong Li, Yanbo Xu, Simiao Zuo, Haoming Jiang, Chao Zhang, Tuo Zhao, Hongyuan Zha. ICML 2023.
- Efficient Deep Ensemble Inference via Query Difficulty-dependent Task Scheduling. Zichong Li, Lan Zhang, Mu Yuan, Miaohui Song, Qi Song. ICDE 2023.
- CoTel: Ontology-Neural Co-Enhanced Text Labeling. Miaohui Song, Lan Zhang, Mu Yuan, Zichong Li, Qi Song, Yijun Liu, Guidong Zheng. WWW 2023.
- Transformer Hawkes Process. Simiao Zuo, Haoming Jiang, Zichong Li, Tuo Zhao, Hongyuan Zha. ICML 2020.
- PRIMAL: A Linear Programming-based Sparse Learning Library in R and Python. Qianli Shen*, Zichong Li*, Yujia Xie, Tuo Zhao. Software. *Equal contribution.
Education
University of Science and Technology of China
M.S. in Data Science, rank 1/56
Advisor: Prof. Lan Zhang
University of Science and Technology of China
B.S. in Mathematics and Applied Mathematics / Probability Statistics