2 results found
Root Mean Square Layer Normalization
Created 2019-09-24
7 commits to master branch, last one about a year ago
Code for the paper "On the Expressivity Role of LayerNorm in Transformers' Attention" (Findings of ACL'2023)
Created 2023-05-03
4 commits to main branch, last one 2 months ago