Bruce-Lee-LY / flash_attention_inference

Benchmarks the performance of the C++ interfaces of flash attention and flash attention v2 in large language model (LLM) inference scenarios.
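For context, the operation being benchmarked is scaled dot-product attention, O = softmax(Q K^T / sqrt(d)) V, which flash attention computes in a single fused kernel without materializing the full seq_q x seq_k score matrix. Below is a minimal single-head reference implementation in plain C++ for illustration only; it is not this repository's API, and all names in it are hypothetical.

// Naive single-head scaled dot-product attention: O = softmax(Q K^T / sqrt(d)) V.
// Flash attention kernels fuse these steps to avoid storing the full score matrix.
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

// q: [seq_q, d], k: [seq_k, d], v: [seq_k, d], out: [seq_q, d], all row-major.
void naive_attention(const std::vector<float>& q, const std::vector<float>& k,
                     const std::vector<float>& v, std::vector<float>& out,
                     int seq_q, int seq_k, int d) {
    const float scale = 1.0f / std::sqrt(static_cast<float>(d));
    std::vector<float> scores(seq_k);
    for (int i = 0; i < seq_q; ++i) {
        // Scores for query row i: q_i . k_j / sqrt(d), tracking the row max
        // for a numerically stable softmax.
        float max_score = -INFINITY;
        for (int j = 0; j < seq_k; ++j) {
            float s = 0.0f;
            for (int t = 0; t < d; ++t) s += q[i * d + t] * k[j * d + t];
            scores[j] = s * scale;
            max_score = std::max(max_score, scores[j]);
        }
        // Softmax over the row.
        float denom = 0.0f;
        for (int j = 0; j < seq_k; ++j) {
            scores[j] = std::exp(scores[j] - max_score);
            denom += scores[j];
        }
        // Output row i is the softmax-weighted sum of the V rows.
        for (int t = 0; t < d; ++t) {
            float acc = 0.0f;
            for (int j = 0; j < seq_k; ++j) acc += scores[j] * v[j * d + t];
            out[i * d + t] = acc / denom;
        }
    }
}

int main() {
    const int seq_q = 1, seq_k = 4, d = 8;  // decode-style shape: one query token
    std::vector<float> q(seq_q * d, 0.1f), k(seq_k * d, 0.2f), v(seq_k * d, 0.3f);
    std::vector<float> out(seq_q * d);
    naive_attention(q, k, v, out, seq_q, seq_k, d);
    std::printf("out[0] = %f\n", out[0]);  // 0.3, since all V rows are identical
    return 0;
}

In LLM decode, seq_q is typically 1 (one new token attending to the cached keys and values), which is the shape used in main() above.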

Date Created 2023-08-16 (about a year ago)
Commits 1 (last one 25 days ago)
Stargazers 35 (0 this week)
Watchers 1 (0 this week)
Forks 3
License BSD-3-Clause
Ranking

RepositoryStats indexes 630,443 repositories; of these, Bruce-Lee-LY/flash_attention_inference is ranked #572,226 (9th percentile) for total stargazers and #555,755 for total watchers. GitHub reports the primary language for this repository as C++; among repositories using this language it is ranked #31,129/33,653.

Bruce-Lee-LY/flash_attention_inference is also tagged with popular topics, for which it is ranked: llm (#3,020/3,538), gpu (#917/965), cuda (#643/686), nvidia (#318/331), inference (#311/330)

Star History

GitHub stargazers over time

[chart omitted; y-axis 0 to 35 stars, x-axis Sep '24 to Mar '25]

Watcher History

GitHub watchers over time, collection started in '23

[chart omitted; y-axis 0 to 2 watchers, x-axis mid-Nov '24 to mid-Mar '25]

Recent Commit History

1 commit on the default branch (master) since Jan '22

[chart omitted; y-axis 0 to 1 commits, x-axis Oct '24 to Mar '25]

Yearly Commits

Commits to the default branch (master) per year

[chart omitted; a single bar: 1 commit in 2024]

Issue History

Total, open, and closed issues over time

[chart omitted; y-axis 0 to 4 issues, x-axis Oct '23 to Mar '25]

Languages

The primary language is C++, but several others are present:

C++, Shell, C, Python, Cuda, CMake

updated: 2025-03-19 @ 06:23am, id: 679281575 / R_kgDOKH0Dpw