Search Results - RepositoryStats

35

198

mit

4

Shush is an app that deploys a WhisperV3 model with Flash Attention v2 on Modal and makes requests to it via a NextJS app

modal whisper shadcn-ui transcription machine-learning flash-attention-2 huggingface-transformers

Created 2023-11-18

64 commits to main branch, last one 9 months ago

11

101

apache-2.0

4

Triton implementation of FlashAttention2 that adds Custom Masks.

triton attention triton-lang cuda-kernels deep-learning flash-attention flash-attention-2 attention-mechanism

Created 2024-07-20

18 commits to main branch, last one 7 months ago

1

36

unknown

3

Uses the powerful WhisperS2T and Ctranslate2 libraries to batch transcribe multiple files

transcr whispers2t ctranslate2 transcriber transcription audio-recorder audio-recording flash-attention-2 audio-transcribing audio-transcription

Created 2024-02-29

40 commits to main branch, last one 4 days ago

3

35

bsd-3-clause

1

Performance of the C++ interface of flash attention and flash attention v2 in large language model (LLM) inference scenarios.

gpu llm mha cuda nvidia cutlass inference tensor-core flash-attention flash-attention-2 large-language-model multi-head-attention

Created 2023-08-16

1 commits to master branch, last one 19 days ago