3 results found
[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
Created 2023-06-12
50 commits to main branch, last one 7 months ago
[NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
Created 2024-01-31
12 commits to main branch, last one 4 months ago
OpenSSA: Small Specialist Agents based on Domain-Aware Neurosymbolic Agent (DANA) architecture for industrial problem-solving
Created 2023-06-26
3,033 commits to main branch, last one 12 days ago