2 results found

ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference
License: apache-2.0
Created 2024-10-22
13 commits to main branch, last one 24 days ago
The official implementation of the paper "MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression"
License: mit
Created 2024-06-19
26 commits to master branch, last one 11 days ago