2024 Fall
Specific Requirements
- We focus on papers published at SOSP and OSDI within the past two years.
- We adopt a "1+N" presentation format: one person will present the main content, while the other N members will assist with preparation and handle the Q&A session.
- We expect a detailed discussion of the paper's strengths and weaknesses, as well as a comprehensive overview of related work from the past three years. The presentation should last at least 45 minutes.
Other Information
The recorded video and a text summary will be uploaded to bilibili and zhihu as soon as possible.
Schedule
September 03
- 💡 [OSDI'24] Parrot: Efficient Serving of LLM-based Applications with Semantic Variable
- 🙎♂️ Chaoyi Ruan, Kunzhao Xu, Bosen Yang
- 📕 slides, 📃 Q&A summary, 📺 video
September 10
- 💡 [SOSP'23] PIT: Optimization of Dynamic Sparse Deep Learning Models via Permutation Invariant Transformation
- 🙎♂️ Jiaan Zhu (Andy), Qinghe Wang, Long Zhao
- 📕 slides, 📃 Q&A summary, 📺 video
September 18
- 💡 [OSDI'24] Nomad: Non-Exclusive Memory Tiering via Transactional Page Migration
- 🙎♂️ Jiahao Li
- 📕 slides, 📃 Q&A summary, 📺 video
September 24
- 💡 [OSDI'24] μSlope: High Compression and Fast Search on Semi-Structured Logs
- 🙎♂️ Yuming Xu, Hengyu Liang
- 📕 slides, 📃 Q&A summary, 📺 video
October 08
- 💡 How (and How Not) to Write a Good Systems Paper
- 🙎♂️ Xiaosong Ma (MBZUAI), Kang Chen (THU), Cheng Li (USTC)
- 📕 slides
October 15
- 💡 [OSDI'24] InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management
- 🙎♂️ Ping Gong, Jiawei Yi, Juncheng Zhang
- 📕 slides, 📃 Q&A summary, 📺 video
October 22
- 💡 [OSDI'24] Harvesting Memory-bound CPU Stall Cycles in Software with MSH
- 🙎♂️ Luofan Chen, Jiyang Wang
- 📕 slides, 📃 Q&A summary, 📺 video
October 29
- 💡 [OSDI'24] ServerlessLLM: Low-Latency Serverless Inference for Large Language Models
- 🙎♂️ Mingxuan Liu (NPU, Northwestern Polytechnical University)
- 📕 slides, 📃 Q&A summary, 📺 video
November 05
- 💡 [SOSP'24] ReCycle: Resilient Training of Large DNNs using Pipeline Adaptation
- 🙎♂️ Wenhao Huang, Yichi Chen, Yanjie Wang (all from TJU, Tianjin University)
- 📕 slides, 📃 Q&A summary, 📺 video
November 12
- 💡 [SOSP'24] TENPLEX: Dynamic Parallelism for Deep Learning using Parallelizable Tensor Collections
- 🙎♂️ Zhaolin Duan, Yuhang Zhang, Yihan Wang (all from TJU, Tianjin University)
- 📕 slides, 📃 Q&A summary, 📺 video
November 19
- 💡 [SOSP'23] UGache: A Unified GPU Cache for Embedding-based Deep Learning
- 🙎♂️ Zheng Yang, Yicheng Zhang
- 📕 slides, 📺 video
🔥 November 26
- 💡 [SOSP'24] BIZA: Design of Self-Governing Block-Interface ZNS AFA for Endurance and Performance
- 🙎♂️ Jingze Huo, Qingyuan Chen
December 03
- 💡 [SOSP'24] LoongServe: Efficiently Serving Long-Context Large Language Models with Elastic Sequence Parallelism
- 🙎♂️ Zewen Jin, Hongrui Zhan, Shen Fu
December 10
- 💡 [SOSP'24] PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU
- 🙎♂️ Haiquan Wang, Jia He, Jiaqi Ruan
December 17
- 💡 [OSDI'24] Motor: Enabling Multi-Versioning for Distributed Transactions on Disaggregated Memory
- 🙎♂️ Sen Han
December 24
- 💡 [OSDI'23] Effectively Scheduling Computational Graphs of Deep Neural Networks toward Their Domain-Specific Accelerators
- 🙎♂️ Shiyi Wang, Jingbo Su
December 31
- 💡
- 🙎♂️ Yinhe Chen
January 07
- 💡 [OSDI'24] DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving
- 🙎♂️ Mingxuan Liu (NPU, Northwestern Polytechnical University)