34 results found
- Filter by Primary Language:
- Python (23)
- C++ (3)
- TypeScript (2)
- Jupyter Notebook (1)
- Kotlin (1)
- Shell (1)
- Vue (1)
- Java (1)
- JavaScript (1)
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
Created
2023-05-28
2,493 commits to main branch, last one 12 hours ago
SGLang is a fast serving framework for large language models and vision language models.
Created
2024-01-08
1,526 commits to main branch, last one 2 hours ago
:electron: An unofficial https://bgm.tv UI-first app client for Android and iOS, built with React Native. An ad-free, hobby-driven, non-profit, Douban-style anime tracker dedicated to ACG; a third-party client for bgm.tv. Redesigned for mobile, with many enhanced features that are hard to implement on the web version, plus extensive customization options. ...
Created
2019-05-08
2,711 commits to master branch, last one 3 days ago
Mixture-of-Experts for Large Vision-Language Models
Created
2023-12-14
228 commits to main branch, last one 18 days ago
PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538
Created
2019-07-19
30 commits to master branch, last one 8 months ago
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)
Created
2023-07-24
212 commits to main branch, last one 5 months ago
Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
Created
2022-09-01
29 commits to main branch, last one 5 months ago
A toolkit for inference and evaluation of 'mixtral-8x7b-32kseqlen' from Mistral AI
Created
2023-12-09
54 commits to main branch, last one about a year ago
Tutel MoE: An Optimized Mixture-of-Experts Implementation
Created
2021-08-06
182 commits to main branch, last one 29 days ago
Chinese Mixtral mixture-of-experts large language models (Chinese Mixtral MoE LLMs)
Created
2024-01-11
31 commits to main branch, last one 7 months ago
MindSpore online courses: Step into LLM
Created
2023-03-21
126 commits to master branch, last one about a month ago
😘 A Pinterest-style layout site that shows illustrations from pixiv.net, ordered by popularity.
Created
2016-09-05
405 commits to master branch, last one 2 years ago
Official LISTEN.moe Android app
Created
2017-03-26
1,747 commits to main branch, last one 6 days ago
Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).
Created
2023-12-26
116 commits to main branch, last one 9 months ago
A libGDX cross-platform API for in-app purchasing.
Created
2014-10-06
538 commits to master branch, last one 4 months ago
ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward experts. We released a collection of ModuleFormer-based Language M...
Created
2023-08-24
19 commits to main branch, last one 11 months ago
MoH: Multi-Head Attention as Mixture-of-Head Attention
Created
2024-10-08
19 commits to main branch, last one about a month ago
MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts
Created
2024-10-08
11 commits to main branch, last one 2 months ago
Official LISTEN.moe Desktop Client
Created
2018-11-28
125 commits to master branch, last one 4 years ago
Implementation of MoE-Mamba from the paper "MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts", in PyTorch and Zeta
Created
2024-01-21
14 commits to main branch, last one 11 months ago
🚀 Easy, open-source LLM finetuning with one-line commands, seamless cloud integration, and popular optimization frameworks. ✨
Created
2024-07-05
189 commits to main branch, last one 4 months ago
Large-scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still a work in progress)*
Created
2023-06-14
554 commits to main branch, last one about a year ago
Batch-download high-quality videos from https://twist.moe
This repository has been archived
Created
2018-01-06
150 commits to master branch, last one about a year ago
japReader is an app for breaking down Japanese sentences and tracking vocabulary progress
Created
2022-07-25
260 commits to master branch, last one 7 months ago
🚀LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training
Created
2024-11-26
272 commits to main branch, last one 18 days ago
Implementation of Switch Transformers from the paper: "Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity"
Created
2024-01-22
11 commits to main branch, last one 11 months ago
[Preprint] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models
Created
2024-05-17
12 commits to main branch, last one 4 months ago
This repository has no description...
Created
2022-12-13
39 commits to main branch, last one about a year ago
[ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal, Shiwei Liu, Zhangyang Wang
Created
2023-02-18
2 commits to main branch, last one about a year ago
PyTorch open-source library for the paper "AdaTT: Adaptive Task-to-Task Fusion Network for Multitask Learning in Recommendations"
Created
2023-06-03
3 commits to main branch, last one 4 months ago