24 results found

400
5.5k
apache-2.0
49
Official release of InternLM2 7B and 20B base and chat models. 200K context support
Created 2023-07-06
220 commits to main branch, last one 11 days ago
257
2.5k
apache-2.0
13
Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)
Created 2023-09-21
171 commits to main branch, last one 5 months ago
Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in PyTorch
Created 2023-05-15
28 commits to main branch, last one about a month ago
43
558
apache-2.0
5
Transformers with Arbitrarily Large Context
Created 2023-06-01
47 commits to main branch, last one 7 days ago
34
530
mit
6
LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding
Created 2023-07-29
59 commits to main branch, last one 4 months ago
Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in PyTorch
Created 2024-02-14
196 commits to main branch, last one 2 months ago
Implementation of Recurrent Memory Transformer, NeurIPS 2022 paper, in PyTorch
Created 2023-04-24
67 commits to main branch, last one 4 months ago
PyTorch implementation of Infini-Transformer from "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" (https://arxiv.org/abs/2404.07143)
Created 2024-04-15
81 commits to main branch, last one about a month ago
20
230
mit
15
The code of our paper "InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory"
Created 2024-03-03
41 commits to main branch, last one 2 months ago
Code for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718
Created 2023-11-22
68 commits to main branch, last one 16 days ago
9
149
apache-2.0
8
LongAlign: A Recipe for Long Context Alignment Encompassing Data, Training, and Evaluation
Created 2024-01-27
45 commits to main branch, last one 2 months ago
LongQLoRA: Extend Context Length of LLMs Efficiently
Created 2023-10-22
25 commits to master branch, last one 7 months ago
TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
Created 2024-04-04
15 commits to main branch, last one 3 days ago
ACL 2024 | LooGLE: Long Context Evaluation for Long-Context Language Models
Created 2023-11-02
25 commits to main branch, last one 4 months ago
Open-source code for the paper "Retrieval Head Mechanistically Explains Long-Context Factuality"
Created 2024-04-18
19 commits to main branch, last one 10 days ago
The official repo for "LLoCo: Learning Long Contexts Offline"
Created 2024-04-12
4 commits to main branch, last one 14 days ago
Implementation of Infini-Transformer in PyTorch
Created 2024-05-01
42 commits to main branch, last one about a month ago
Implementation of paper "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models"
Created 2023-05-31
16 commits to main branch, last one 2 months ago
Implementation of Perceiver AR, DeepMind's new long-context attention network based on Perceiver architecture, in PyTorch
Created 2022-06-18
20 commits to main branch, last one about a year ago
Awesome LLM Plaza: daily tracking of all sorts of awesome LLM topics, e.g. LLMs for coding, robotics, reasoning, multimodality, etc.
Created 2023-11-12
165 commits to main branch, last one 21 hours ago
Counting-Stars (★)
Created 2024-03-13
179 commits to main branch, last one a day ago
My own attempt at a long context genomics model, leveraging recent advances in long context attention modeling (Flash Attention + other hierarchical methods)
Created 2023-05-18
10 commits to main branch, last one 12 months ago
4
46
unknown
0
This is the official implementation of the paper "Needle In A Multimodal Haystack"
Created 2024-06-05
40 commits to main branch, last one 9 days ago
The official implementation of "Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks"
Created 2024-04-10
53 commits to main branch, last one 2 months ago