34 results found (primary language: Python 29, Jupyter Notebook 2, C++ 1, Rust 1)
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Created 2020-01-23 · 2,332 commits to master branch, last one a day ago
Run Mixtral-8x7B models in Colab or consumer desktops
Created 2023-12-15 · 86 commits to master branch, last one 5 months ago
Decentralized deep learning in PyTorch. Built to train models on thousands of volunteers across the world.
Created 2020-02-27 · 575 commits to master branch, last one a day ago
Mixture-of-Experts for Large Vision-Language Models
Created 2023-12-14 · 226 commits to main branch, last one 29 days ago
PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538
Created 2019-07-19 · 30 commits to master branch, last one about a month ago
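The sparsely-gated layer that several of these repos re-implement can be sketched in a few lines. This is a minimal illustrative NumPy version of top-k gating in the spirit of Shazeer et al. (arXiv:1701.06538), not code from any repository listed here; all names, shapes, and the use of plain linear maps as "experts" are simplifying assumptions.

```python
# Minimal top-k gated mixture-of-experts sketch (NumPy, illustrative only).
# Real implementations use FFN experts, batched dispatch, and load-balancing losses.
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 8, 4, 2

# Hypothetical experts: simple linear maps instead of full feed-forward blocks.
experts = [rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(n_experts)]
w_gate = rng.standard_normal((d_model, n_experts)) * 0.1

def moe_forward(x):
    """Route each token to its top-k experts and mix their outputs."""
    logits = x @ w_gate                              # (tokens, n_experts)
    topk = np.argsort(logits, axis=-1)[:, -top_k:]   # indices of the k largest gates
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        sel = topk[t]
        # Softmax over the selected logits only: the gate is sparse,
        # so the other experts receive exactly zero weight (and no compute).
        g = np.exp(logits[t, sel] - logits[t, sel].max())
        g /= g.sum()
        for w, e in zip(g, sel):
            out[t] += w * (x[t] @ experts[e])
    return out

tokens = rng.standard_normal((3, d_model))
y = moe_forward(tokens)
print(y.shape)  # (3, 8): output keeps the input shape
```

Because only `top_k` of the `n_experts` expert matrices are applied per token, parameter count grows with `n_experts` while per-token compute stays roughly constant, which is the core appeal of the approach.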
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training
Created 2023-07-24 · 211 commits to main branch, last one 3 months ago
A TensorFlow Keras implementation of "Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts" (KDD 2018)
Created 2018-09-10 · 22 commits to master branch, last one 2 years ago
Tutel MoE: An Optimized Mixture-of-Experts Implementation
Created 2021-08-06 · 173 commits to main branch, last one 7 days ago
Surrogate Modeling Toolbox
Created 2016-11-08 · 1,489 commits to master branch, last one 15 days ago
A PyTorch implementation of Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models
Created 2020-07-13 · 33 commits to master branch, last one 9 months ago
From scratch implementation of a sparse mixture of experts language model inspired by Andrej Karpathy's makemore :)
Created 2024-01-22 · 115 commits to main branch, last one 2 months ago
Chinese Mixtral Mixture-of-Experts large language models (Chinese Mixtral MoE LLMs)
Created 2024-01-11 · 31 commits to main branch, last one about a month ago
A library for easily merging multiple LLM experts and efficiently training the merged LLM.
Created 2024-04-08 · 41 commits to main branch, last one 12 days ago
GMoE could be the next backbone model for many kinds of generalization tasks.
Created 2022-05-28 · 28 commits to main branch, last one about a year ago
Implementation of ST-MoE, the latest incarnation of MoE after years of research at Brain, in PyTorch
Created 2023-03-26 · 67 commits to main branch, last one 9 days ago
Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).
Created 2023-12-26 · 116 commits to main branch, last one 3 months ago
Implementation of Soft MoE, proposed by Brain's Vision team, in PyTorch
Created 2023-08-04 · 28 commits to main branch, last one about a month ago
Fast Inference of MoE Models with CPU-GPU Orchestration
Created 2024-02-05 · 49 commits to main branch, last one about a month ago
A curated reading list of research in Adaptive Computation, Dynamic Compute & Mixture of Experts (MoE).
Created 2023-08-17 · 123 commits to main branch, last one 22 days ago
Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts
Created 2023-04-21 · 41 commits to main branch, last one 11 months ago
RealCompo: Balancing Realism and Compositionality Improves Text-to-Image Diffusion Models
Created 2024-02-20 · 15 commits to main branch, last one 18 days ago
[SIGIR'24] The official implementation code of MOELoRA.
Created 2023-10-19 · 20 commits to master branch, last one 17 days ago
Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)*
Created 2023-06-14 · 554 commits to main branch, last one 6 months ago
Repository for "See More Details: Efficient Image Super-Resolution by Experts Mining", ICML 2024
Created 2024-02-05 · 30 commits to main branch, last one 12 days ago
PyTorch library for cost-effective, fast, and easy serving of MoE models.
Created 2024-01-22 · 15 commits to main branch, last one 11 days ago
PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf)
Created 2023-08-07 · 21 commits to main branch, last one 8 months ago
[ICLR 2024 Spotlight] Code for the paper "Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy"
Created 2023-09-30 · 24 commits to main branch, last one 7 days ago
The implementation of "Leeroo Orchestrator: Elevating LLMs Performance Through Model Integration"
Created 2024-01-23 · 19 commits to main branch, last one about a month ago
Efficient global optimization toolbox in Rust: Bayesian optimization, mixture of Gaussian processes, sampling methods
Created 2020-08-27 · 490 commits to master branch, last one 6 days ago
[ICML 2022] "Neural Implicit Dictionary via Mixture-of-Expert Training" by Peihao Wang, Zhiwen Fan, Tianlong Chen, Zhangyang Wang
Created 2022-07-08 · 11 commits to main branch, last one 5 months ago