2026 Spring
Specific Requirements
- We adopt a topic-based organization this semester. Presenters should select papers that fit the semester topics. Papers are no longer restricted to SOSP, OSDI, or any other specific venue, provided that the choice is confirmed with the organizers at registration time and meets the quality bar.
- The presentation format is flexible. For one paper, a full discussion should last 45–50 minutes, while a sharing presentation should last 30–35 minutes. Presenters are expected to prepare around 30–40 slides.
- The presentation should clearly introduce the main idea, technical contributions, and key strengths and weaknesses of the paper. Relevant background and related work may also be included when appropriate.
Other Information
Video recordings and text summaries will be uploaded to Bilibili and Zhihu as soon as possible.
Schedule
April 7
- 💡 Kick-off meeting
- 🙎♂️ Zewen Jin, Chizheng Fang, Yuzhe Li, Mulong Li and Shen Fu
- 📕 slides
April 14
Topic I
- 💡 [CVPR'26] AdaCluster: Adaptive Query-Key Clustering for Sparse Attention in Video Generation
- 🙎♂️ Haoyue Tan
- 📕 slides
- 📃 Q&A summary, 📺 video
Topic II
- 💡 [arXiv] IndexCache: Accelerating Sparse Attention via Cross-Layer Index Reuse
- 🙎♂️ Ruibo Liu, Ouxiang Zhou
- 📕 slides
- 📺 video
April 21
Topic I
- 💡 [arXiv] Token Sparse Attention: Efficient Long-Context Inference with Interleaved Token Selection
- 🙎♂️ Chengjie Tang, Shen Fu
- 📕 slides
April 28
Topic I
- 💡 [arXiv] PROBE: Co-Balancing Computation and Communication in MoE Inference via Real-Time Predictive Prefetching
- 🙎♂️ Qinghe Wang
- 📕 slides
Topic II
- 💡 [arXiv] FluxMoE: Decoupling Expert Residency for High-Performance MoE Serving
- 🙎♂️ Long Zhao
- 📕 slides
May 12
- 💡 Comprehensive introduction to the DeepSeek-V4 technical report (Part I)
- 🙎♂️ Chengru Yang, Chengjie Tang, Ouxiang Zhou, Shen Fu, Ruibo Liu, Yinhe Chen
- 📕 slides
May 19
- 💡 Comprehensive introduction to the DeepSeek-V4 technical report (Part II)
- 🙎♂️ Congkun Ai, Yuzhe Li, Jiahui Tan, Chenhan Wang, Chizheng Fang
- 📕 slides
May 26
- 💡 [arXiv] JANUS: Disaggregating Attention and Experts for Scalable MoE Inference
- 🙎♂️ Chizheng Fang
- 📕 slides