3 results found

Implementation of Block Recurrent Transformer - Pytorch
Created 2023-02-07
65 commits to main branch, last one 2 months ago

Data and code for our paper "Why Does the Effective Context Length of LLMs Fall Short?"
Created 2024-10-24
19 commits to main branch, last one 2 days ago