22 results found
- Filter by Primary Language:
- Python (10)
- Cuda (4)
- Jupyter Notebook (2)
- MLIR (1)
- HTML (1)
- C++ (1)
- C (1)
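AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks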
Created
2022-09-22
1,431 commits to main branch, last one 8 days ago
🚀 Awesome System for Machine Learning ⚡️ AI System Papers and Industry Practice. ⚡️ System for Machine Learning, LLM (Large Language Model), GenAI (Generative AI). 🍻 OSDI, NSDI, SIGCOMM, SoCC, MLSys...
Created
2019-01-07
509 commits to master branch, last one 7 months ago
Quantized Attention that achieves speedups of 2.1-3.1x and 2.7-5.1x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.
Created
2024-10-03
86 commits to main branch, last one 9 days ago
[ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
Created
2024-11-06
160 commits to main branch, last one 9 hours ago
Distributed RL System for LLM Reasoning
Created
2025-02-24
129 commits to main branch, last one 4 days ago
A model compilation solution for various hardware
Created
2023-02-06
416 commits to main branch, last one 2 days ago
SpargeAttention: A training-free sparse attention that can accelerate inference for any model.
Created
2025-02-25
32 commits to main branch, last one 2 days ago
FedScale is a scalable and extensible open-source federated learning (FL) platform.
Created
2021-04-01
707 commits to master branch, last one about a year ago
Measure and optimize the energy consumption of your AI applications!
Created
2022-08-13
333 commits to master branch, last one 5 days ago
Machine Learning Framework for Operating Systems - Brings ML to Linux kernel
Created
2021-11-10
28 commits to main branch, last one 3 years ago
An acceleration library that supports arbitrary bit-width combinatorial quantization operations
Created
2024-08-23
57 commits to main branch, last one 6 months ago
A scalable & efficient active learning/data selection system for everyone.
Created
2022-05-15
110 commits to main branch, last one 9 months ago
A collection of noteworthy MLSys bloggers (Algorithms/Systems)
Created
2025-01-05
10 commits to master branch, last one 3 months ago
📚FFPA(Split-D): Yet another Faster Flash Prefill Attention with O(1) SRAM complexity for large headdim (D > 256), ~2x↑🎉 vs SDPA EA.
Created
2024-11-29
247 commits to main branch, last one 12 days ago
A ChatGPT(GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems
Created
2024-01-30
24 commits to main branch, last one 5 months ago
Optimal Sparse Decision Trees
Created
2019-06-05
28 commits to master branch, last one about a year ago
Materials for my 2021 NYU class on NLP and ML Systems (Master of Engineering).
Created
2021-09-10
42 commits to main branch, last one 2 years ago
Federated Learning Systems Paper List
Created
2022-04-25
57 commits to main branch, last one about a year ago
sensAI: ConvNets Decomposition via Class Parallelism for Fast Inference on Live Data
Created
2019-03-05
163 commits to master branch, last one 4 years ago
NAACL '24 (Best Demo Paper Runner-Up) / MLSys @ NeurIPS '23 - RedCoast: A Lightweight Tool to Automate Distributed Training and Inference
Created
2022-12-19
674 commits to master branch, last one 3 months ago
[ICLR 2025] TidalDecode: A Fast and Accurate LLM Decoding with Position Persistent Sparse Attention
Created
2024-07-16
56 commits to main branch, last one 5 months ago
GraphSnapShot: Caching Local Structure for Fast Graph Learning [Efficient ML System]
Created
2023-10-30
241 commits to main branch, last one 4 months ago