63 results found

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
4.3k forks · 37.7k stars · apache-2.0 · 347 watchers
Created 2020-01-23
2,741 commits to master branch, last one a day ago
Run Mixtral-8x7B models in Colab or on consumer desktops
Created 2023-12-15
86 commits to master branch, last one about a year ago
Decentralized deep learning in PyTorch. Built to train models on thousands of volunteer machines across the world.
Created 2020-02-27
591 commits to master branch, last one 16 days ago
Mixture-of-Experts for Large Vision-Language Models
134 forks · 2.1k stars · apache-2.0 · 23 watchers
Created 2023-12-14
228 commits to main branch, last one 4 months ago
PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538
Created 2019-07-19
30 commits to master branch, last one 11 months ago
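The entry above re-implements Shazeer et al.'s sparsely-gated MoE layer, whose core move is routing each token to only the top-k experts. A minimal sketch of that top-k gating idea (not this repo's actual API; the class name, dimensions, and expert shape are illustrative assumptions):

```python
# Sketch of top-k sparsely-gated MoE routing; all sizes are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model=64, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(d_model, n_experts)  # router producing expert logits
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.ReLU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts))

    def forward(self, x):                           # x: (tokens, d_model)
        logits = self.gate(x)                       # (tokens, n_experts)
        topv, topi = logits.topk(self.k, dim=-1)    # keep only the k best experts
        weights = F.softmax(topv, dim=-1)           # renormalize over the chosen k
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = topi[:, slot] == e           # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
        return out

y = TopKMoE()(torch.randn(10, 64))
print(y.shape)  # torch.Size([10, 64])
```

The paper adds noisy gating and a load-balancing loss on top of this routing; the sketch shows only the dispatch/combine skeleton.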
Codebase for Aria - an Open Multimodal Native MoE
86 forks · 1.0k stars · apache-2.0 · 20 watchers
Created 2024-09-29
207 commits to main branch, last one 2 months ago
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)
55 forks · 945 stars · apache-2.0 · 8 watchers
Created 2023-07-24
212 commits to main branch, last one 9 months ago
Tutel MoE: An Optimized Mixture-of-Experts Implementation
95 forks · 791 stars · mit · 16 watchers
Created 2021-08-06
202 commits to main branch, last one 8 hours ago
Surrogate Modeling Toolbox
215 forks · 747 stars · bsd-3-clause · 27 watchers
Created 2016-11-08
1,585 commits to master branch, last one 3 days ago
A PyTorch implementation of Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models
Created 2020-07-13
33 commits to master branch, last one about a year ago
A TensorFlow Keras implementation of "Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts" (KDD 2018)
Created 2018-09-10
22 commits to master branch, last one 3 years ago
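The MMoE entry above follows a simple pattern: a pool of shared experts, one softmax gate per task, and a small tower per task. A rough sketch of that pattern in PyTorch (the repo itself is TF Keras; all names and sizes here are illustrative assumptions):

```python
# Sketch of Multi-gate Mixture-of-Experts (MMoE) for multi-task learning.
import torch
import torch.nn as nn

class MMoE(nn.Module):
    def __init__(self, d_in=32, d_expert=16, n_experts=4, n_tasks=2):
        super().__init__()
        self.experts = nn.ModuleList(                 # shared across all tasks
            nn.Sequential(nn.Linear(d_in, d_expert), nn.ReLU())
            for _ in range(n_experts))
        self.gates = nn.ModuleList(                   # one gate per task
            nn.Linear(d_in, n_experts) for _ in range(n_tasks))
        self.towers = nn.ModuleList(                  # one output head per task
            nn.Linear(d_expert, 1) for _ in range(n_tasks))

    def forward(self, x):                             # x: (batch, d_in)
        e = torch.stack([ex(x) for ex in self.experts], dim=1)  # (batch, E, d_expert)
        outs = []
        for gate, tower in zip(self.gates, self.towers):
            w = gate(x).softmax(dim=-1).unsqueeze(-1)            # (batch, E, 1)
            outs.append(tower((w * e).sum(dim=1)))               # task prediction
        return outs

preds = MMoE()(torch.randn(8, 32))
print([p.shape for p in preds])  # [torch.Size([8, 1]), torch.Size([8, 1])]
```

The per-task gates let each task weight the shared experts differently, which is the paper's mechanism for modeling task relationships.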
From scratch implementation of a sparse mixture of experts language model inspired by Andrej Karpathy's makemore :)
Created 2024-01-22
116 commits to main branch, last one 5 months ago
Chinese Mixtral mixture-of-experts large models (Chinese Mixtral MoE LLMs)
44 forks · 602 stars · apache-2.0 · 15 watchers
Created 2024-01-11
31 commits to main branch, last one 11 months ago
A library for easily merging multiple LLM experts and efficiently training the merged LLM.
30 forks · 460 stars · lgpl-3.0 · 5 watchers
Created 2024-04-08
43 commits to main branch, last one 7 months ago
Implementation of ST-MoE, the latest incarnation of MoE after years of research at Brain, in PyTorch
Created 2023-03-26
68 commits to main branch, last one 9 months ago
Implementation of Soft MoE, proposed by Brain's Vision team, in PyTorch
Created 2023-08-04
29 commits to main branch, last one 23 hours ago
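Soft MoE, implemented by the repo above, replaces hard token routing with learned soft dispatch and combine weights: each expert slot is a weighted average of all tokens, and each token's output is a weighted average of all slot outputs. A rough sketch of that idea, assuming one slot per expert (names and shapes are illustrative, not the repo's API):

```python
# Sketch of Soft MoE dispatch/combine; one slot per expert is an assumption.
import torch
import torch.nn as nn

class SoftMoE(nn.Module):
    def __init__(self, d=64, n_experts=4):
        super().__init__()
        self.slot_emb = nn.Parameter(torch.randn(n_experts, d) * d ** -0.5)
        self.experts = nn.ModuleList(nn.Linear(d, d) for _ in range(n_experts))

    def forward(self, x):                       # x: (batch, tokens, d)
        logits = x @ self.slot_emb.t()          # (batch, tokens, slots)
        dispatch = logits.softmax(dim=1)        # normalize over tokens per slot
        combine = logits.softmax(dim=2)         # normalize over slots per token
        slots = dispatch.transpose(1, 2) @ x    # (batch, slots, d): soft token mix
        y = torch.stack([ex(s) for ex, s in
                         zip(self.experts, slots.unbind(dim=1))], dim=1)
        return combine @ y                      # (batch, tokens, d)

out = SoftMoE()(torch.randn(2, 10, 64))
print(out.shape)  # torch.Size([2, 10, 64])
```

Because every token touches every slot with a continuous weight, the layer is fully differentiable and needs no load-balancing loss, unlike top-k routing.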
GMoE could be the next backbone model for many kinds of generalization tasks.
Created 2022-05-28
28 commits to main branch, last one 2 years ago
Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).
Created 2023-12-26
116 commits to main branch, last one about a year ago
MoH: Multi-Head Attention as Mixture-of-Head Attention
9 forks · 232 stars · apache-2.0 · 3 watchers
Created 2024-10-08
19 commits to main branch, last one 5 months ago
[ICLR'25] Fast Inference of MoE Models with CPU-GPU Orchestration
18 forks · 203 stars · apache-2.0 · 8 watchers
Created 2024-02-05
49 commits to main branch, last one 11 months ago
[ICLR 2025] MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts
6 forks · 198 stars · apache-2.0 · 2 watchers
Created 2024-10-08
11 commits to main branch, last one 5 months ago
PyTorch library for cost-effective, fast and easy serving of MoE models.
12 forks · 160 stars · apache-2.0 · 4 watchers
Created 2024-01-22
31 commits to main branch, last one 10 days ago
[SIGIR'24] The official implementation of MOELoRA.
Created 2023-10-19
21 commits to master branch, last one 8 months ago
A curated reading list of research in Adaptive Computation, Inference-Time Computation & Mixture of Experts (MoE).
Created 2023-08-17
139 commits to main branch, last one 3 months ago
[ICML 2024] See More Details: Efficient Image Super-Resolution by Experts Mining
Created 2024-02-05
31 commits to main branch, last one 8 months ago
PyTorch implementation of the PEER block from the paper "Mixture of A Million Experts" by Xu Owen He at DeepMind
Created 2024-07-09
26 commits to main branch, last one 7 months ago
[ICLR 2025] LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation
7 forks · 122 stars · apache-2.0 · 7 watchers
Created 2024-08-26
17 commits to main branch, last one 3 days ago
Some personal experiments around routing tokens to different autoregressive attention blocks, akin to mixture-of-experts
Created 2023-04-21
42 commits to main branch, last one 5 months ago
[NeurIPS 2024] RealCompo: Balancing Realism and Compositionality Improves Text-to-Image Diffusion Models
Created 2024-02-20
17 commits to main branch, last one 4 months ago