4 results found

Annotations of the interesting ML papers I read
Created 2021-04-18
96 commits to main branch, last one 6 days ago
Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)*
Created 2023-06-14
554 commits to main branch, last one 6 months ago
Odysseus: Playground of LLM Sequence Parallelism
Created 2024-06-04
50 commits to main branch, last one 12 days ago
apache-2.0
A LLaMA1/LLaMA2 Megatron implementation.
Created 2023-06-26
10 commits to main branch, last one 6 months ago