32 results found

4.3k · 34.6k · apache-2.0 · 214
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
Created 2023-05-28
2,377 commits to main branch, last one a day ago
518 · 6.1k · apache-2.0 · 58
SGLang is a fast serving framework for large language models and vision language models.
Created 2024-01-08
1,267 commits to main branch, last one a day ago
135 · 3.8k · mit · 23
:electron: An unofficial, UI-first app client for https://bgm.tv on Android and iOS, built with React Native. An ad-free, hobby-driven, non-profit, ACG-focused anime-tracking app in the style of Douban; a third-party client for bgm.tv. Redesigned for mobile, it builds in many enhanced features that are hard to achieve on the web version and offers considerable customization options. ...
Created 2019-05-08
2,664 commits to master branch, last one 8 days ago
128 · 2.0k · apache-2.0 · 24
Mixture-of-Experts for Large Vision-Language Models
Created 2023-12-14
226 commits to main branch, last one 6 months ago
PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538
Created 2019-07-19
30 commits to master branch, last one 7 months ago
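
For readers unfamiliar with the layer this repository re-implements: Shazeer et al.'s sparsely-gated MoE routes each token through only the top-k of many expert feed-forward networks and mixes their outputs with learned gate weights. Below is a minimal PyTorch sketch of that idea only; class and parameter names are illustrative (not taken from the repository), and the paper's noisy gating and load-balancing loss are omitted.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TopKGatedMoE(nn.Module):
        # Sparsely-gated MoE sketch: a linear gate scores every expert,
        # each token keeps its top-k experts, and their outputs are mixed
        # by the renormalized gate weights (noise/balancing terms omitted).
        def __init__(self, d_model, d_hidden, num_experts=8, k=2):
            super().__init__()
            self.k = k
            self.gate = nn.Linear(d_model, num_experts, bias=False)
            self.experts = nn.ModuleList(
                nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(),
                              nn.Linear(d_hidden, d_model))
                for _ in range(num_experts)
            )

        def forward(self, x):                          # x: (tokens, d_model)
            logits = self.gate(x)                      # (tokens, num_experts)
            top_val, top_idx = logits.topk(self.k, dim=-1)
            weights = F.softmax(top_val, dim=-1)       # softmax over the k chosen experts
            out = torch.zeros_like(x)
            for slot in range(self.k):
                idx = top_idx[:, slot]                 # expert id chosen in this slot
                w = weights[:, slot].unsqueeze(-1)
                for e, expert in enumerate(self.experts):
                    mask = idx == e
                    if mask.any():
                        out[mask] += w[mask] * expert(x[mask])
            return out

    # Example (hypothetical sizes): y = TopKGatedMoE(512, 2048)(torch.randn(16, 512))
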
46 · 883 · apache-2.0 · 8
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)
Created 2023-07-24
212 commits to main branch, last one 4 months ago
80 · 765 · apache-2.0 · 8
A toolkit for inference and evaluation of 'mixtral-8x7b-32kseqlen' from Mistral AI
Created 2023-12-09
54 commits to main branch, last one 11 months ago
64 · 761 · apache-2.0 · 7
Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
Created 2022-09-01
29 commits to main branch, last one 4 months ago
93 · 735 · mit · 15
Tutel MoE: An Optimized Mixture-of-Experts Implementation
Created 2021-08-06
181 commits to main branch, last one 3 days ago
43 · 584 · apache-2.0 · 15
Chinese Mixtral mixture-of-experts large language models (Chinese Mixtral MoE LLMs)
Created 2024-01-11
31 commits to main branch, last one 6 months ago
MindSpore online courses: Step into LLM
Created 2023-03-21
126 commits to master branch, last one 28 days ago
😘 A Pinterest-style layout site that shows illustrations from pixiv.net, ordered by popularity.
Created 2016-09-05
405 commits to master branch, last one 2 years ago
Official LISTEN.moe Android app
Created 2017-03-26
1,733 commits to main branch, last one 4 days ago
Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).
Created 2023-12-26
116 commits to main branch, last one 8 months ago
83 · 224 · apache-2.0 · 38
A libGDX cross-platform API for in-app purchasing.
Created 2014-10-06
538 commits to master branch, last one 3 months ago
12 · 216 · apache-2.0 · 11
ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward experts. We released a collection of ModuleFormer-based Language M...
Created 2023-08-24
19 commits to main branch, last one 10 months ago
5 · 157 · apache-2.0 · 3
MoH: Multi-Head Attention as Mixture-of-Head Attention
Created 2024-10-08
19 commits to main branch, last one 22 days ago
3 · 140 · apache-2.0 · 2
MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts
Created 2024-10-08
11 commits to main branch, last one about a month ago
Official LISTEN.moe Desktop Client
Created 2018-11-28
125 commits to master branch, last one 4 years ago
🚀 Easy, open-source LLM finetuning with one-line commands, seamless cloud integration, and popular optimization frameworks. ✨
Created 2024-07-05
189 commits to main branch, last one 3 months ago
Implementation of MoE Mamba from the paper: "MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts" in Pytorch and Zeta
Created 2024-01-21
14 commits to main branch, last one 10 months ago
Large-scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)*
Created 2023-06-14
554 commits to main branch, last one 11 months ago
15 · 73 · unlicense · 7
Batch download high quality videos from https://twist.moe
This repository has been archived.
Created 2018-01-06
150 commits to master branch, last one about a year ago
japReader is an app for breaking down Japanese sentences and tracking vocabulary progress
Created 2022-07-25
260 commits to master branch, last one 6 months ago
Implementation of Switch Transformers from the paper: "Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity"
Created 2024-01-22
11 commits to main branch, last one 10 months ago
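
As background for the Switch Transformers entry above: the paper's key simplification is routing each token to a single expert (top-1) and scaling that expert's output by the router probability. A hedged PyTorch sketch of that routing rule, with placeholder names not taken from this repository, and with capacity limits and the auxiliary load-balancing loss left out:

    import torch
    import torch.nn.functional as F

    def switch_route(x, router, experts):
        # Top-1 ("switch") routing sketch: softmax the router logits,
        # send each token to its single highest-probability expert, and
        # scale that expert's output by the router probability.
        probs = F.softmax(router(x), dim=-1)       # (tokens, num_experts)
        top_p, top_idx = probs.max(dim=-1)         # best expert per token
        out = torch.zeros_like(x)
        for e, expert in enumerate(experts):
            mask = top_idx == e
            if mask.any():
                out[mask] = top_p[mask].unsqueeze(-1) * expert(x[mask])
        return out

    # Example with toy modules (hypothetical sizes):
    # router = torch.nn.Linear(512, 4, bias=False)
    # experts = torch.nn.ModuleList(torch.nn.Linear(512, 512) for _ in range(4))
    # y = switch_route(torch.randn(16, 512), router, experts)
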
9 · 50 · apache-2.0 · 11
[Preprint] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models
Created 2024-05-17
12 commits to main branch, last one 3 months ago
This repository has no description...
Created 2022-12-13
39 commits to main branch, last one about a year ago
[ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal, Shiwei Liu, Zhangyang Wang
Created 2023-02-18
2 commits to main branch, last one about a year ago
PyTorch open-source library for the paper "AdaTT: Adaptive Task-to-Task Fusion Network for Multitask Learning in Recommendations"
Created 2023-06-03
3 commits to main branch, last one 3 months ago
Inference framework for MoE layers based on TensorRT with Python binding
Created 2021-05-31
44 commits to master branch, last one 3 years ago