2025 Spring
Specific Requirements
- We focus on the latest papers from SOSP and OSDI, as well as papers released on arXiv. For each session, presenters select one paper from SOSP or OSDI and one from arXiv.
- The presentation follows a "1+N" format, where one person delivers the main content while supporting members assist with preparation and manage the Q&A session. These supporting members are also encouraged to contribute to the presentation.
- The discussion should provide a thorough analysis of the paper's strengths and weaknesses, along with a comprehensive review of related work from the past three years. The presentation must be at least 45 minutes long.
Other Information
The playback video and text summary will be uploaded to bilibili and zhihu as soon as possible.
Schedule
February 25
- 💡 Kick-off meeting
- 🙋‍♂️ Jiyang Wang, Kunzhao Xu and Cheng Li
- slides
March 11
- 💡 Comprehensive introduction of DeepSeek-AI's technical report (PART Ⅰ)
- 🙋‍♂️ Xin Ren, Tonghuan Xiao, Jiahui Tan, Yandong Shi, Kunzhao Xu, Yifei Liu, Chongzhuo Yang, Jiaan Zhu, Zewen Jin, Yinhe Chen, Ping Gong, Guanbin Xu, Haiquan Wang, Quan Zhou and Chaoyi Ruan
- MLA slides, DualPipe slides, FP8 Training slides, MTP slides, Q&A summary, 📺 video
March 18
Topic Ⅰ
- 💡 Comprehensive introduction of DeepSeek-AI's technical report (PART Ⅱ)
- 🙋‍♂️ Xin Ren, Tonghuan Xiao, Jiahui Tan, Yandong Shi, Kunzhao Xu, Yifei Liu, Chongzhuo Yang, Jiaan Zhu, Zewen Jin, Yinhe Chen, Ping Gong, Guanbin Xu, Haiquan Wang, Quan Zhou and Chaoyi Ruan
- RL slides, 3fs slides
Topic Ⅱ
- 💡 [OSDI'24] Ladder: Enabling Efficient Low-Precision Deep Learning Computing through Hardware-aware Tensor Transformation
- 🙋‍♂️ Chengru Yang
- slides
Summary and Video
March 25
Topic Ⅰ
- 💡 [OSDI'24] FairyWren: A Sustainable Cache for Emerging Write-Read-Erase Flash Interfaces
- 🙋‍♂️ Qingyuan Chen
- slides
Topic Ⅱ
- 💡 [arXiv] fMoE: Fine-Grained Expert Offloading for Large Mixture-of-Experts Serving
- 🙋‍♂️ Jia He, Jiaqi Ruan
- slides
Summary and Video
April 1
Topic Ⅰ
- 💡 [SOSP'24] CHIME: A Cache-Efficient and High-Performance Hybrid Index on Disaggregated Memory
- 🙋‍♂️ Sen Han
Topic Ⅱ
- 💡 [arXiv] Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache
- 🙋‍♂️ Tonghuan Xiao, Xin Ren
April 8
Topic Ⅰ
- 💡 [OSDI'25] Achieving Low-Latency Graph-Based Vector Search via Aligning Best-First Search Algorithm with SSD
- 🙋‍♂️ Hengyu Liang
Topic Ⅱ
- 💡 [arXiv] Klotski: Efficient Mixture-of-Expert Inference via Expert-Aware Multi-Batch Pipeline
- 🙋‍♂️ Jiawei Yi