2 results found
⚡ Boost inference speed of T5 models by 5x and reduce the model size by 3x.
Created 2021-03-11
38 commits to master branch, last one 2 years ago
SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator
Created 2024-12-11
65 commits to main branch, last one about a month ago