3 results found

Implementation of the sparse attention pattern proposed by the DeepSeek team in their "Native Sparse Attention" paper
Created 2025-02-19
78 commits to main branch, last one 2 days ago

ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference
Created 2024-10-22
13 commits to main branch, last one 3 months ago

The official implementation of the paper "MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression"
Created 2024-06-19
31 commits to master branch, last one 2 months ago