3 results found

[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
Created 2023-06-12
50 commits to main branch, last one 3 months ago
18
230
unknown
14
KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
Created 2024-01-31
11 commits to main branch, last one 17 days ago
27
172
other
17
OpenSSA: Small Specialist Agents—Enabling Efficient, Domain-Specific Planning + Reasoning for AI
Created 2023-06-26
2,442 commits to main branch, last one 21 days ago