2 results found

12
215
bsd-3-clause
4
Root Mean Square Layer Normalization
Created 2019-09-24
7 commits to master branch, last one about a year ago
Code for the paper "On the Expressivity Role of LayerNorm in Transformers' Attention" (Findings of ACL'2023)
Created 2023-05-03
4 commits to main branch, last one 2 months ago