4 results found
Use PEFT or Full-parameter to finetune 500+ LLMs (Qwen2.5, InternLM3, GLM4, Llama3.3, Mistral, Yi1.5, Baichuan2, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, Llama3.2-Visio...
Created
2023-08-01
1,686 commits to main branch, last one 12 hours ago
Megatron was a Telegram file management bot that helped many users, especially movie channel managers, upload their files to Telegram by just providing a link. The project initially started...
Created
2020-09-09
210 commits to master branch, last one 3 years ago
Large-scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)*
Created
2023-06-14
554 commits to main branch, last one about a year ago
A LLaMA1/LLaMA2 Megatron implementation.
Created
2023-06-26
10 commits to main branch, last one about a year ago