7 results found
- Filter by Primary Language:
- Python (5)
- C++ (1)
- Cuda (1)
NumPy & SciPy for GPU
Created
2016-11-01
28,514 commits to main branch, last one a day ago
An open collection of implementation tips, tricks and resources for training large language models
Created
2023-03-06
26 commits to main branch, last one about a year ago
An open collection of methodologies to help with successful training of large language models.
Created
2023-03-08
18 commits to main branch, last one 10 months ago
Distributed and decentralized training framework for PyTorch over graph
Created
2019-12-03
1,094 commits to master branch, last one about a year ago
Federated Learning Utilities and Tools for Experimentation
Created
2021-11-17
54 commits to main branch, last one 9 months ago
Efficient Distributed GPU Programming for Exascale, an SC/ISC Tutorial
Created
2021-09-23
240 commits to main branch, last one 3 days ago
NCCL Fast Socket is a transport layer plugin to improve NCCL collective communication performance on Google Cloud.
Created
2021-07-19
20 commits to master branch, last one 6 months ago