3 results found

1. Implementation of the sparse attention pattern proposed by the Deepseek team in their "Native Sparse Attention" paper
   Created 2025-02-19 · 82 commits to main branch, last one 8 hours ago
   6 · 150 · apache-2.0 · 3
2. ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference
   Created 2024-10-22 · 13 commits to main branch, last one 3 months ago
   6 · 114 · mit · 5
3. The official implementation of the paper <MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression>
   Created 2024-06-19 · 31 commits to master branch, last one 2 months ago