24 results found
Official release of InternLM2 7B and 20B base and chat models. 200K context support
Created
2023-07-06
220 commits to main branch, last one 11 days ago
Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)
Created
2023-09-21
171 commits to main branch, last one 5 months ago
Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in PyTorch
Created
2023-05-15
28 commits to main branch, last one about a month ago
Transformers with Arbitrarily Large Context
Created
2023-06-01
47 commits to main branch, last one 7 days ago
LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding
Created
2023-07-29
59 commits to main branch, last one 4 months ago
Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in PyTorch
Created
2024-02-14
196 commits to main branch, last one 2 months ago
Implementation of Recurrent Memory Transformer, NeurIPS 2022 paper, in PyTorch
Created
2023-04-24
67 commits to main branch, last one 4 months ago
PyTorch implementation of Infini-Transformer from "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" (https://arxiv.org/abs/2404.07143)
Created
2024-04-15
81 commits to main branch, last one about a month ago
The code of our paper "InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory"
Created
2024-03-03
41 commits to main branch, last one 2 months ago
Code for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718
Created
2023-11-22
68 commits to main branch, last one 16 days ago
LongAlign: A Recipe for Long Context Alignment Encompassing Data, Training, and Evaluation
Created
2024-01-27
45 commits to main branch, last one 2 months ago
LongQLoRA: Extend Context Length of LLMs Efficiently
Created
2023-10-22
25 commits to master branch, last one 7 months ago
TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
Created
2024-04-04
15 commits to main branch, last one 3 days ago
ACL 2024 | LooGLE: Long Context Evaluation for Long-Context Language Models
Created
2023-11-02
25 commits to main branch, last one 4 months ago
Open-source code for the paper "Retrieval Head Mechanistically Explains Long-Context Factuality"
Created
2024-04-18
19 commits to main branch, last one 10 days ago
The official repo for "LLoCo: Learning Long Contexts Offline"
Created
2024-04-12
4 commits to main branch, last one 14 days ago
Implementation of Infini-Transformer in PyTorch
Created
2024-05-01
42 commits to main branch, last one about a month ago
Implementation of the paper "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models"
Created
2023-05-31
16 commits to main branch, last one 2 months ago
Implementation of Perceiver AR, DeepMind's new long-context attention network based on the Perceiver architecture, in PyTorch
Created
2022-06-18
20 commits to main branch, last one about a year ago
Awesome LLM Plaza: daily tracking of all sorts of awesome LLM topics, e.g. LLMs for coding, robotics, reasoning, multimodal, etc.
Created
2023-11-12
165 commits to main branch, last one 21 hours ago
Counting-Stars (★)
Created
2024-03-13
179 commits to main branch, last one a day ago
My own attempt at a long context genomics model, leveraging recent advances in long context attention modeling (Flash Attention + other hierarchical methods)
Created
2023-05-18
10 commits to main branch, last one 12 months ago
This is the official implementation of the paper "Needle In A Multimodal Haystack"
Created
2024-06-05
40 commits to main branch, last one 9 days ago
The official implementation of "Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks"
Created
2024-04-10
53 commits to main branch, last one 2 months ago