5 results found

Annotations of the interesting ML papers I read
Created 2021-04-18
123 commits to main branch, last one 10 days ago
5
120
apache-2.0
3
Super-Efficient RLHF Training of LLMs with Parameter Reallocation
Created 2024-06-18
1,071 commits to main branch, last one a day ago
Large-scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still a work in progress)*
Created 2023-06-14
554 commits to main branch, last one 11 months ago
Odysseus: Playground of LLM Sequence Parallelism
Created 2024-06-04
50 commits to main branch, last one 5 months ago
2
27
apache-2.0
1
A LLaMA1/LLaMA2 Megatron implementation.
Created 2023-06-26
10 commits to main branch, last one 11 months ago