6 results found

1. Implementation of the sparse attention pattern proposed by the DeepSeek team in their "Native Sparse Attention" paper
   Created 2025-02-19 · 209 commits to main branch, last one 2 days ago
   19 · 365 · apache-2.0 · 5
2. SpargeAttention: A training-free sparse attention that can accelerate any model inference.
   Created 2025-02-25 · 26 commits to main branch, last one 14 days ago
   7 · 153 · apache-2.0 · 3
3. ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference
   Created 2024-10-22 · 13 commits to main branch, last one 5 months ago
4. Efficient Triton implementation of Native Sparse Attention.
   Created 2025-02-24 · 54 commits to main branch, last one 3 days ago
   6 · 122 · mit · 6
5. The official implementation of the paper "MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression"
   Created 2024-06-19 · 31 commits to master branch, last one 3 months ago
   4 · 66 · apache-2.0 · 1
6. Code for the paper "FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference" (ICLR 2025 Oral)
   Created 2025-02-18 · 8 commits to main branch, last one 18 hours ago