Microsoft at ASPLOS 2024: Advancing hardware and software for high-scale, secure, and efficient modern applications
Publication | 3D Optical Data Storage of Glass for Sustainable Cloud Archival Storage (JSAP) | Masaaki Sakakura, Andromachi Chatzieleftheriou, Ant Rowstron, Ariel Gomez Diaz, Austin Donnelly, Benn Thomsen, Burcu Canakci, Christos Gkantsidis, Daniel Cletheroe, David Sweeney, Erika B. Aranas, Hugh Williams, Ioan Stefanovici, James Clegg, Patrick Anderson, Pashmina Cameron, Rokas Drevinskas, Sergey Legtchenko, Charles Whittaker, Freddie Hong, Takashi Lawson, Tim Deegan, Richard Black, Valentin Kapitany, Stefan Winzeck | The 72nd Japanese Society of Applied Physics Spring Meeting 2025 | March 2025
Publication | POD-Attention: Unlocking Full Prefill-Decode Overlap for Faster LLM Inference | Aditya K Kamath, Ramya Prabhu, Jayashree Mohan, Simon Peter, Ramachandran Ramjee, Ashish Panwar | Architectural Support for Programming Languages and Operating Systems (ASPLOS) 2025 | March 2025
Publication | DynamoLLM: Designing LLM Inference Clusters for Performance and Energy Efficiency | Jovan Stojkovic, Chaojie Zhang, Íñigo Goiri, Josep Torrellas, Esha Choukse | HPCA | March 2025
Publication | Advancing Mobile GUI Agents: A Verifier-Driven Approach to Practical Deployment | Gaole Dai, Shiqi Jiang, Ting Cao, Yuanchun Li, Yuqing Yang, Rui Tan, Mo Li, Lili Qiu | March 2025
Publication | KnapsackLB: Enabling Performance-Aware Layer-4 Load Balancing | Rohan Gandhi, Srinivas Narayana | ACM CoNEXT | March 2025
Publication | DORADD: Deterministic Parallel Execution in the Era of Microsecond-Scale Computing | Zhengqing Liu, Musa Unal, Matthew J. Parkinson, Marios Kogias | PPoPP ’25 | March 2025
Publication | FlashFFTStencil: Bridging Fast Fourier Transforms to Memory-Efficient Stencil Computations on Tensor Core Units | Haozhi Han, Kun Li, Wei Cui, Donglin Bai, Yiwei Zhang, Liang Yuan, Yifeng Chen, Yunquan Zhang, Ting Cao, Mao Yang | ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP) | March 2025
Publication | vAttention: Dynamic Memory Management for Serving LLMs without PagedAttention | Ramya Prabhu, Ajay Nayak, Jayashree Mohan, Ramachandran Ramjee, Ashish Panwar | Architectural Support for Programming Languages and Operating Systems (ASPLOS) 2025 | March 2025
Publication | The future of the industrial AI edge is cellular | Xenofon Foukas, Bozidar Radunovic | HotMobile | February 2025
Publication | Fast, Transparent Filesystem Microkernel Recovery with Ananke | Jing Liu, Yifan Dai, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau | USENIX Conference on File and Storage Technologies | February 2025 | Erik Riedel Best Paper Award