Implementation of Switch Transformers from the paper: "Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity"
Created 2024-01-22
11 commits to main branch, last one 8 months ago