Bruce-Lee-LY / decoding_attention

Decoding Attention is specially optimized for multi-head attention (MHA) using CUDA cores for the decoding stage of LLM inference.

Date Created 2024-08-14 (6 months ago)
Commits 1 (last one 3 months ago)
Stargazers 29 (0 this week)
Watchers 2 (0 this week)
Forks 2
License bsd-3-clause
Ranking

RepositoryStats indexes 616,861 repositories. Of these, Bruce-Lee-LY/decoding_attention is ranked #592,670 (4th percentile) for total stargazers and #493,533 for total watchers. GitHub reports the primary language for this repository as C++; among repositories using this language, it is ranked #31,929/32,971.

Bruce-Lee-LY/decoding_attention is also tagged with popular topics. For these, it is ranked: llm (#3,011/3,266), gpu (#929/946), cuda (#661/678), nvidia (#319/325), inference (#309/319).
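The percentile figure in the ranking above can be reproduced with simple arithmetic. The sketch below assumes the percentile is the share of indexed repositories ranked below the given position, rounded to the nearest whole number. RepositoryStats does not state its formula here, so the helper name `percentile_from_rank` and the formula itself are illustrative assumptions, not the site's documented method.

```python
def percentile_from_rank(rank: int, total: int) -> float:
    """Return the percentage of repositories ranked below `rank`.

    Assumed formula (not confirmed by RepositoryStats):
    100 * (total - rank) / total
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return 100.0 * (total - rank) / total


# Stargazer rank #592,670 out of 616,861 indexed repositories.
stargazer_pct = percentile_from_rank(592_670, 616_861)
print(round(stargazer_pct))  # prints 4
```

Under the same assumption, the C++ ranking of #31,929 out of 32,971 works out to roughly the 3rd percentile.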

Star History

GitHub stargazers over time

[Chart: stargazer count, y-axis 0 to 30, x-axis Sep '24 to Feb '25]

Watcher History

GitHub watchers over time, collection started in '23

[Chart: watcher count, y-axis 1 to 3, x-axis Dec '24 to Feb '25]

Recent Commit History

1 commit on the default branch (master) since Jan '22

[Chart: commits over time, y-axis 0 to 1, x-axis 15 Nov to 15 Feb]

Yearly Commits

Commits to the default branch (master) per year

[Chart: commits per year, y-axis 0 to 1, x-axis 2024]

Issue History

No issues have been posted

Languages

The primary language is C++, but the repository also contains other languages:

C++, Python, Shell, CMake, Cuda, C

updated: 2025-02-17 @ 10:18am, id: 842468267 / R_kgDOMjcLqw