59 results found

817
13.5k
cc-by-sa-4.0
121
Machine Learning Engineering Open Book
Created 2020-09-02
936 commits to master branch, last one 17 days ago
698
3.0k
other
133
Slurm: A Highly Scalable Workload Manager
Created 2011-06-20
67,523 commits to master branch, last one 16 hours ago
682
2.9k
apache-2.0
89
A DSL for data-driven computational pipelines
Created 2013-03-27
7,011 commits to master branch, last one a day ago
169
1.8k
mpl-2.0
13
dstack is an open-source alternative to Kubernetes and Slurm, designed to simplify GPU allocation and AI workload orchestration for ML teams across top clouds, on-prem clusters, and accelerators.
Created 2022-01-04
2,675 commits to master branch, last one 5 hours ago
Python 3.8+ toolbox for submitting jobs to Slurm
Created 2020-04-24
147 commits to main branch, last one 7 months ago
242
909
apache-2.0
54
A scalable, efficient, cross-platform (Linux/macOS) and easy-to-use workflow engine in pure Python.
Created 2015-03-30
6,212 commits to master branch, last one 21 hours ago
121
511
gpl-2.0
22
Python Interface to Slurm
Created 2011-11-20
756 commits to main branch, last one 2 months ago
102
408
gpl-3.0
34
Open source web interface for Slurm HPC & AI clusters
Created 2015-02-27
562 commits to main branch, last one 14 days ago
Best practices & guides on how to write distributed PyTorch training code
Created 2024-07-31
271 commits to main branch, last one 2 months ago
A Slurm cluster using docker-compose
Created 2017-09-11
45 commits to main branch, last one 6 months ago
129
360
other
20
TorchX is a universal job launcher for PyTorch applications. TorchX is designed to have fast iteration time for training/research and support for E2E production ML pipelines when you're ready.
Created 2021-05-04
801 commits to main branch, last one a day ago
Lightweight fast function pipeline (DAG) creation in pure Python for scientific workflows 🕸️🧪
Created 2023-07-16
744 commits to main branch, last one 3 days ago
Create clusters of VMs on the cloud and configure them with Ansible.
Created 2013-03-20
2,187 commits to master branch, last one about a year ago
A scheduler for GPU/CPU tasks
Created 2020-10-29
233 commits to master branch, last one about a year ago
Simplify HPC and Batch workloads on Azure
This repository has been archived
Created 2016-08-26
923 commits to master branch, last one 2 years ago
A Cross-Platform, Multi-Cloud High-Performance Computing Platform
Created 2023-10-15
2,444 commits to master branch, last one 3 months ago
Prometheus exporter for performance metrics from Slurm.
Created 2017-04-18
148 commits to master branch, last one 3 years ago
138
242
apache-2.0
18
An open-source toolkit for deploying and managing high performance clusters for HPC, AI, and data analytics workloads.
Created 2020-02-18
7,507 commits to main branch, last one a day ago
26
210
apache-2.0
9
Run Slurm in Kubernetes
Created 2024-06-04
1,394 commits to dev branch, last one 23 hours ago
31
180
other
3
SEML: Slurm Experiment Management Library
Created 2019-10-13
634 commits to master branch, last one 5 months ago
51
179
lgpl-3.0
9
Tools for computation on batch systems
Created 2015-10-26
930 commits to master branch, last one 2 years ago
TUI for the Slurm Workload Manager
Created 2023-01-29
68 commits to main branch, last one 7 months ago
A simple Snakemake profile for Slurm without --cluster-config
Created 2021-05-01
59 commits to main branch, last one 10 months ago
28
150
apache-2.0
7
R package to send function calls as jobs on LSF, SGE, Slurm, PBS/Torque, or each via SSH
Created 2016-06-18
1,255 commits to master branch, last one 5 days ago
77
126
other
16
A collection of various resources, examples, and executables for the general NREL HPC user community's benefit. Use the following website for accessing documentation.
Created 2019-01-07
511 commits to master branch, last one about a month ago
Funnel is a toolkit for distributed task execution via a simple, standard API.
Created 2017-02-03
1,435 commits to master branch, last one 2 days ago
39
105
gpl-3.0
3
Slurm-Mail is a drop-in replacement for Slurm's e-mails that gives users much more information about their jobs than the standard Slurm e-mails.
Created 2018-02-11
918 commits to main branch, last one 2 days ago
A template for starting reproducible Python machine-learning projects with hardware acceleration. Find an example at https://github.com/CLAIRE-Labo/no-representation-no-trust
Created 2022-10-15
80 commits to main branch, last one 13 days ago
Slurm Docker Container on CentOS 7
Created 2016-12-11
139 commits to main branch, last one about a year ago
11
85
mit
0
A Slurm dashboard for the terminal.
Created 2020-05-28
197 commits to main branch, last one 3 years ago