About Me
I am a rising second-year PhD student at the UCB Sky Lab, advised by Prof. Ion Stoica and Prof. Joseph Gonzalez. My current research focuses broadly on agent infrastructure, including benchmarks and data (e.g. Frontier-CS), inference (e.g. Continuum), context engineering automation (e.g. ACE, Combee)… Previously, I graduated from the University of Chicago with a BS in Math and CS (both with honors), where I mainly worked with Prof. Junchen Jiang.
I write about technology, business, and society on my blog. The best way to follow me is on Twitter or LinkedIn!
In a previous life, I was the first employee of Tensormesh (creator of LMCache), where I led Product and Growth in the beginning. Research-wise, I was one of the very first people to optimize KV cache reuse beyond GPU HBM for LLMs (the CacheXXX papers).
I am open to collaborations and to working with undergraduate students who show excellence and tenacity. Please check out the collaborations page if you are interested!
News
[May 2026] Starting a summer internship at Anuttacon (Mihoyo’s AI branch) working on coding data!
[Aug 2025] Starting PhD at UCB!
[Apr 2025] Starting my first industry job as the first hire at Tensormesh!
Areas of Work
Data & Eval: Frontier-CS: Coding benchmark for long-horizon agents with dense and continuous reward functions
Agent Improvement: ACE, Combee: Automatic context engineering
Agent Efficiency: Continuum, CacheGen, CacheBlend: Efficient Agent Inference, RL stack beyond vLLM and SGLang
Selected Publications
* indicates equal contribution
Combee: Scaling Prompt Learning for Self-Improving Language Model Agents
Hanchen Li*, Runyuan He*, Qizheng Zhang, Changxiu Ji, Qiuyang Mang, Xiaokun Chen, Lakshya A Agrawal, Wei-Liang Liao, Eric Yang, Alvin Cheung, James Zou, Kunle Olukotun, Ion Stoica, Joseph E. Gonzalez
[Paper]
Continuum: Efficient and Robust Multi-Turn LLM Agent Scheduling with KV Cache Time-to-Live
Hanchen Li, Qiuyang Mang, Runyuan He, Qizheng Zhang, Huanzhi Mao, Xiaokun Chen, Alvin Cheung, Joseph Gonzalez, Ion Stoica
arXiv [Paper] [Preview Code]
Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models
Qizheng Zhang, Changran Hu, Shubhangi Upasani, Boyuan Ma, Fenglu Hong, Vamsidhar Kamanuru, Jay Rainton, Chen Wu, Mengmeng Ji, Hanchen Li, Urmish Thakker, James Zou, Kunle Olukotun
ICLR 2026 [Paper] [Repo]
CacheBlend: Fast Large Language Model Serving for RAG with Cached Knowledge Fusion
Jiayi Yao, Hanchen Li, Yuhan Liu, Siddhant Ray, Yihua Cheng, Qizheng Zhang, Kuntai Du, Shan Lu, Junchen Jiang
EuroSys 2025 (Best Paper Award!) [Paper]
CacheGen: Fast Context Loading for Language Model Applications
Yuhan Liu, Hanchen Li, Kuntai Du, Jiayi Yao, Yihua Cheng, Yuyang Huang, Shan Lu, Michael Maire, Henry Hoffmann, Ari Holtzman, Ganesh Ananthanarayanan, Junchen Jiang
SIGCOMM 2024 [Paper] [Talk] [Slides]
Towards More Economical Context-Augmented LLM Generation by Reusing Stored KV Cache
Hanchen Li*, Yuhan Liu*, Yihua Cheng, Kuntai Du, Junchen Jiang
NSDI Poster 2024 [Link]
All Publications (check Google Scholar)
Life
- My name in Chinese is 李翰宸 and I grew up in Nanjing, Jiangsu.
- I consider myself an absurdist.
- I play soccer, basketball, and tennis, and I lift. I used to enjoy 掼蛋 (Guan Dan, a popular card game in Jiangsu, China) with friends, but I have now gone back to reading.
