3 results found
🎉 CUDA notes / hand-written CUDA kernels for large models / C++ notes, updated sporadically: flash_attn, sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.
Created 2022-12-17
110 commits to main branch, last one a day ago
Root Mean Square Layer Normalization
Created 2019-09-24
7 commits to master branch, last one about a year ago
Code for the paper "On the Expressivity Role of LayerNorm in Transformers' Attention" (Findings of ACL'2023)
Created 2023-05-03
3 commits to main branch, last one 11 months ago