Search Results - RepositoryStats

1 result found

259

3.7k

gpl-3.0

116

📖A curated list of Awesome LLM/VLM Inference Papers with codes: WINT8/4, Flash-Attention, Paged-Attention, MLA, PP/TP/SP/CP/EP Parallelism, Prefix Cache, Chunked-Prefill, etc. 🎉🎉

mla vllm deepseek flash-mla minimax-01 awesome-llm deepseek-r1 deepseek-v3 tensorrt-llm llm-inference flash-attention paged-attention flash-attention-3

Created 2023-08-27

458 commits to main branch, last one 17 days ago