4 results found

A curated reading list of research in Adaptive Computation, Inference-Time Computation & Mixture of Experts (MoE).
Created 2023-08-17
137 commits to main branch, last one 18 days ago
apache-2.0

[Preprint] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models
Created 2024-05-17
12 commits to main branch, last one 4 months ago

An exploration of the "Self Reasoning Tokens" proposed by Felipe Bonetto
Created 2024-05-02
12 commits to main branch, last one 7 months ago

Yet another random morning idea, to be tried quickly and the architecture shared if it works: allowing the transformer to pause for any amount of time on any token
Created 2023-10-18
20 commits to main branch, last one about a year ago
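The pause idea in the last result can be sketched in a minimal, framework-free way: insert special pause tokens after a chosen position so the model gets extra forward passes of computation before committing to the next output token. This is only an illustrative sketch of the input-side mechanics (the token name `<pause>` and the helper `insert_pauses` are assumptions, not taken from the repo), not the repo's actual implementation.

```python
def insert_pauses(tokens, position, n_pauses, pause_token="<pause>"):
    """Insert n_pauses copies of pause_token after the given position.

    Illustrative only: in the pause-token setting, the model attends over
    these extra tokens (gaining compute) while their outputs are ignored.
    """
    if not 0 <= position < len(tokens):
        raise ValueError("position out of range")
    return tokens[: position + 1] + [pause_token] * n_pauses + tokens[position + 1:]

# Example: give the model two extra "thinking" steps after the "=" token.
seq = ["2", "+", "2", "=", "?"]
print(insert_pauses(seq, position=3, n_pauses=2))
# → ['2', '+', '2', '=', '<pause>', '<pause>', '?']
```

How many pauses to insert, and where, is exactly the open question the repo's "pause for any amount of time on any token" framing points at.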