4 results found
A curated reading list of research in Adaptive Computation, Inference-Time Computation & Mixture of Experts (MoE).
Created 2023-08-17 · 139 commits to main branch, last one 29 days ago
[ICLR 2025] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models
Created 2024-05-17 · 17 commits to main branch, last one 6 days ago
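The DynMoE entry above is about letting each token decide how many experts to activate, instead of a fixed top-k. Below is a minimal PyTorch sketch of what such "top-any" gating with trainable per-expert thresholds could look like; `TopAnyGate`, the sigmoid scoring, and the top-1 fallback are illustrative assumptions, not the repo's actual code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopAnyGate(nn.Module):
    """Illustrative 'top-any' gate: each token activates a variable
    number of experts, namely every expert whose gate score clears
    that expert's trainable threshold (hypothetical sketch)."""
    def __init__(self, hidden_dim: int, num_experts: int):
        super().__init__()
        self.scorer = nn.Linear(hidden_dim, num_experts, bias=False)
        # One trainable activation threshold per expert.
        self.threshold = nn.Parameter(torch.zeros(num_experts))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, hidden_dim) -> routing weights: (tokens, num_experts)
        scores = torch.sigmoid(self.scorer(x))
        active = scores > torch.sigmoid(self.threshold)
        # Guarantee at least one expert per token by always
        # including the top-scoring one.
        top1 = F.one_hot(scores.argmax(dim=-1), scores.size(-1)).bool()
        active = active | top1
        weights = scores * active  # zero out inactive experts
        return weights / weights.sum(dim=-1, keepdim=True).clamp_min(1e-9)
```

The design choice this illustrates: because the threshold is a parameter rather than a hyperparameter, the number of experts a token uses is tuned by training instead of being fixed up front.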
Exploration into the proposed "Self Reasoning Tokens" by Felipe Bonetto
Created 2024-05-02 · 12 commits to main branch, last one 8 months ago
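As a rough illustration of what "Self Reasoning Tokens" might involve, assuming the core idea is to interleave learned tokens whose only training signal comes from their usefulness to future predictions, here is a hypothetical PyTorch sketch. `ReasonInterleaver` and the masking scheme described in the comments are assumptions, not the repo's code.

```python
import torch
import torch.nn as nn

class ReasonInterleaver(nn.Module):
    """Insert a learned <reason> embedding after every input token
    (hypothetical sketch of the interleaving step)."""
    def __init__(self, dim: int):
        super().__init__()
        self.reason_emb = nn.Parameter(torch.randn(dim) * 0.02)

    def forward(self, tok_emb: torch.Tensor) -> torch.Tensor:
        # tok_emb: (batch, seq, dim) -> (batch, 2*seq, dim)
        b, s, d = tok_emb.shape
        out = tok_emb.new_empty((b, 2 * s, d))
        out[:, 0::2] = tok_emb                          # real tokens, even slots
        out[:, 1::2] = self.reason_emb.expand(b, s, d)  # <reason>, odd slots
        return out

# During training, the next-token loss at the odd (<reason>) positions
# would be masked out, so gradient reaches them only through attention
# from later positions -- the pressure that, under this reading of the
# idea, pushes them to encode information useful further ahead.
```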
Yet another random morning idea, quickly tried, with the architecture shared if it works: let the transformer pause for any amount of time on any token (a decoding-loop sketch follows this entry)
Created 2023-10-18 · 20 commits to main branch, last one about a year ago
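The description above suggests a decode-time mechanism where the model can spend extra forward passes on a hard token before committing. A minimal sketch under assumed details: a special <pause> token is fed back whenever the model's top probability falls below a confidence threshold. `generate_with_pauses`, `pause_id`, and the confidence rule are all hypothetical, not the repo's actual mechanism.

```python
import torch

@torch.no_grad()
def generate_with_pauses(model, ids, max_new=50, max_pauses=4,
                         conf=0.5, pause_id=0):
    # ids: (1, seq) token ids; model(ids) is assumed to return
    # next-token logits of shape (1, seq, vocab).
    for _ in range(max_new):
        pauses = 0
        while True:
            probs = torch.softmax(model(ids)[:, -1, :], dim=-1)
            top_p, top_id = probs.max(dim=-1)  # each of shape (1,)
            # Commit once confident enough, or once the pause budget is spent.
            if top_p.item() >= conf or pauses >= max_pauses:
                ids = torch.cat([ids, top_id.unsqueeze(0)], dim=-1)
                break
            # Otherwise emit a <pause> token and take one more forward pass.
            ids = torch.cat([ids, ids.new_full((1, 1), pause_id)], dim=-1)
            pauses += 1
    return ids
```

In this sketch the pause budget (`max_pauses`) caps how long the model may dwell on one position, so generation always terminates; how the repo actually decides when to pause may differ.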