1 result found

📚FFPA(Split-D): Yet another Faster Flash Prefill Attention with O(1) GPU SRAM complexity for headdim > 256, ~2x↑🎉vs SDPA EA.
Created 2024-11-29
246 commits to main branch, last one 7 days ago