3 results found

Implementation of Block Recurrent Transformer - Pytorch
Created 2023-02-07
65 commits to main branch, last one 6 months ago

[ICLR'25] Data and code for our paper "Why Does the Effective Context Length of LLMs Fall Short?"
Created 2024-10-24
22 commits to main branch, last one 2 months ago