5 results found

Annotations of the interesting ML papers I read
Created 2021-04-18
136 commits to main branch, last one 6 days ago
13
221
apache-2.0
4
Super-Efficient RLHF Training of LLMs with Parameter Reallocation
Created 2024-06-18
1,078 commits to main branch, last one about a month ago
Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)*
Created 2023-06-14
554 commits to main branch, last one about a year ago
Odysseus: Playground of LLM Sequence Parallelism
Created 2024-06-04
50 commits to main branch, last one 8 months ago
2
28
apache-2.0
1
A LLaMA1/LLaMA2 Megatron implementation.
Created 2023-06-26
10 commits to main branch, last one about a year ago