About Me

I am a first-year PhD student at UC Berkeley's Sky Lab, advised by Prof. Ion Stoica and Prof. Joseph Gonzalez. My current research focuses on the intersection of systems and AI. Previously, I graduated from the University of Chicago with a BS in Math and CS (both with honors), where I mainly worked with Prof. Junchen Jiang.

I was the first employee of Tensormesh, where I led Product and Growth in its early days. Research-wise, I was one of the very first people to optimize KV cache reuse beyond GPU HBM for LLMs.

I am always open to collaborations and to working with undergraduate students who have strong CS backgrounds. Please check out the collaborations page if you are interested!

Selected Publications

* indicates equal contribution

  • Continuum: Efficient and Robust Multi-Turn LLM Agent Scheduling with KV Cache Time-to-Live
    Hanchen Li, Qiuyang Mang, Runyuan He, Qizheng Zhang, Huanzhi Mao, Xiaokun Chen, Alvin Cheung, Joseph Gonzalez, Ion Stoica
    arXiv
    [Paper]

  • CacheBlend: Fast Large Language Model Serving for RAG with Cached Knowledge Fusion
    Jiayi Yao, Hanchen Li, Yuhan Liu, Siddhant Ray, Yihua Cheng, Qizheng Zhang, Kuntai Du, Shan Lu, Junchen Jiang
    EuroSys 2025 (Best Paper Award!) [Paper]

  • CacheGen: Fast Context Loading for Language Model Applications
    Yuhan Liu, Hanchen Li, Kuntai Du, Jiayi Yao, Yihua Cheng, Yuyang Huang, Shan Lu, Michael Maire, Henry Hoffmann, Ari Holtzman, Ganesh Ananthanarayanan, Junchen Jiang
    SIGCOMM 2024 [Paper] [Talk] [Slides]

  • Towards More Economical Context-Augmented LLM Generation by Reusing Stored KV Cache
    Hanchen Li*, Yuhan Liu*, Yihua Cheng, Kuntai Du, Junchen Jiang
    NSDI Poster 2024 [Link]

All Publications (check Google Scholar)

Life

  • My name in Chinese is 李翰宸 and I grew up in Nanjing, Jiangsu.
  • I play soccer and basketball, and I lift weights. I also enjoy playing 掼蛋 (Guan Dan, a popular card game in Jiangsu, China) with friends.