5 results found
Annotations of the interesting ML papers I read
Created
2021-04-18
123 commits to main branch, last one 10 days ago
Super-Efficient RLHF Training of LLMs with Parameter Reallocation
Created
2024-06-18
1,071 commits to main branch, last one a day ago
Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)*
Created
2023-06-14
554 commits to main branch, last one 11 months ago
Odysseus: Playground of LLM Sequence Parallelism
Created
2024-06-04
50 commits to main branch, last one 5 months ago
A LLaMA1/LLaMA2 Megatron implementation.
Created
2023-06-26
10 commits to main branch, last one 11 months ago