34 results found
- Filter by Primary Language:
- Python (23)
- C++ (3)
- TypeScript (2)
- Jupyter Notebook (1)
- Kotlin (1)
- Shell (1)
- Vue (1)
- Java (1)
- JavaScript (1)
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
Created
2023-05-28
2,493 commits to main branch, last one 12 hours ago
SGLang is a fast serving framework for large language models and vision language models.
Created
2024-01-08
1,526 commits to main branch, last one 2 hours ago
:electron: An unofficial https://bgm.tv UI-first app client for Android and iOS, built with React Native. An ad-free, hobby-driven, non-profit, Douban-style anime tracker dedicated to ACG; a third-party client for bgm.tv. Redesigned for mobile, with many enhanced features that are hard to implement on the web version, plus extensive customization options. ...
Created
2019-05-08
2,711 commits to master branch, last one 3 days ago
Mixture-of-Experts for Large Vision-Language Models
Created
2023-12-14
228 commits to main branch, last one 18 days ago
PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538
Created
2019-07-19
30 commits to master branch, last one 8 months ago
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)
Created
2023-07-24
212 commits to main branch, last one 5 months ago
Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
Created
2022-09-01
29 commits to main branch, last one 5 months ago
A toolkit for inference and evaluation of 'mixtral-8x7b-32kseqlen' from Mistral AI
Created
2023-12-09
54 commits to main branch, last one about a year ago
Tutel MoE: An Optimized Mixture-of-Experts Implementation
Created
2021-08-06
182 commits to main branch, last one 29 days ago
Chinese Mixtral mixture-of-experts large language models (Chinese Mixtral MoE LLMs)
Created
2024-01-11
31 commits to main branch, last one 7 months ago
MindSpore online courses: Step into LLM
Created
2023-03-21
126 commits to master branch, last one about a month ago
😘 A Pinterest-style layout site that shows illustrations from pixiv.net, ordered by popularity.
Created
2016-09-05
405 commits to master branch, last one 2 years ago
Official LISTEN.moe Android app
Created
2017-03-26
1,747 commits to main branch, last one 6 days ago
Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).
Created
2023-12-26
116 commits to main branch, last one 9 months ago
A libGDX cross-platform API for in-app purchasing.
Created
2014-10-06
538 commits to master branch, last one 4 months ago
ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward experts. We released a collection of ModuleFormer-based Language M...
Created
2023-08-24
19 commits to main branch, last one 11 months ago
MoH: Multi-Head Attention as Mixture-of-Head Attention
Created
2024-10-08
19 commits to main branch, last one about a month ago
MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts
Created
2024-10-08
11 commits to main branch, last one 2 months ago
Official LISTEN.moe Desktop Client
Created
2018-11-28
125 commits to master branch, last one 4 years ago
Implementation of MoE-Mamba from the paper "MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts", in PyTorch and Zeta
Created
2024-01-21
14 commits to main branch, last one 11 months ago
🚀 Easy, open-source LLM finetuning with one-line commands, seamless cloud integration, and popular optimization frameworks. ✨
Created
2024-07-05
189 commits to main branch, last one 4 months ago
Large-scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still a work in progress)*
Created
2023-06-14
554 commits to main branch, last one about a year ago
Batch-download high-quality videos from https://twist.moe
This repository has been archived
Created
2018-01-06
150 commits to master branch, last one about a year ago
japReader is an app for breaking down Japanese sentences and tracking vocabulary progress
Created
2022-07-25
260 commits to master branch, last one 7 months ago
🚀LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training
Created
2024-11-26
272 commits to main branch, last one 18 days ago
Implementation of Switch Transformers from the paper: "Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity"
Created
2024-01-22
11 commits to main branch, last one 11 months ago
[Preprint] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models
Created
2024-05-17
12 commits to main branch, last one 4 months ago
This repository has no description...
Created
2022-12-13
39 commits to main branch, last one about a year ago
[ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal, Shiwei Liu, Zhangyang Wang
Created
2023-02-18
2 commits to main branch, last one about a year ago
PyTorch open-source library for the paper "AdaTT: Adaptive Task-to-Task Fusion Network for Multitask Learning in Recommendations"
Created
2023-06-03
3 commits to main branch, last one 4 months ago