23 results found

505
4.8k
apache-2.0
181
The Forge Cross-Platform Rendering Framework: PC Windows, Steamdeck (native), Ray Tracing, macOS / iOS, Android, XBOX, PS4, PS5, Switch, Quest 2
Created 2017-10-03
516 commits to master branch, last one about a month ago
369
1.6k
apache-2.0
93
Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
This repository has been archived
Created 2017-09-08
1,695 commits to master branch, last one 4 years ago
Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and TIMM models.
Created 2020-06-01
98 commits to master branch, last one 26 days ago
Multi-threaded GUI manager for mass creation of AI-generated art with support for multiple GPUs.
Created 2022-09-13
378 commits to main branch, last one 3 months ago
GPU-ready Dockerfile to run Stability.AI stable-diffusion model v2 with a simple web interface. Includes multi-GPU support.
Created 2022-08-25
89 commits to master branch, last one 5 months ago
38
323
bsd-3-clause
9
Package for writing high-level code for parallel high-performance stencil computations that can be deployed on both GPUs and CPUs
Created 2020-12-22
1,074 commits to main branch, last one 20 days ago
100
294
other
58
QUDA is a library for performing calculations in lattice QCD on GPUs.
Created 2011-01-27
14,443 commits to develop branch, last one 2 days ago
A PyTorch implementation of the 'FaceNet' paper for training a facial recognition model with Triplet Loss using the glint360k dataset. A pre-trained model using Triplet Loss is available for download.
Created 2019-03-21
267 commits to master branch, last one 3 years ago
Distributed tensors and Machine Learning framework with GPU and MPI acceleration in Python
Created 2018-05-17
7,160 commits to main branch, last one 8 days ago
Efficient Distributed GPU Programming for Exascale, an SC/ISC Tutorial
Created 2021-09-23
283 commits to main branch, last one 3 days ago
Chains stable-diffusion-webui instances together to facilitate faster image generation.
Created 2023-03-06
247 commits to master branch, last one 25 days ago
54
173
apache-2.0
9
Multi-GPU pre-training on one machine for BERT from scratch without Horovod (data parallelism)
Created 2018-12-25
99 commits to master branch, last one about a month ago
66
167
unknown
37
The world's first CUDA implementation of Weakly-Compressible Smoothed Particle Hydrodynamics
Created 2014-05-28
5,350 commits to master branch, last one 3 years ago
19
164
bsd-3-clause
14
Almost trivial distributed parallelization of stencil-based GPU and CPU applications on a regular staggered grid
Created 2019-12-06
375 commits to master branch, last one 8 days ago
High-level C++ for Accelerator Clusters
Created 2019-07-16
966 commits to master branch, last one a day ago
Efficient and Scalable Physics-Informed Deep Learning and Scientific Machine Learning on top of Tensorflow for multi-worker distributed computing
Created 2020-10-03
309 commits to main branch, last one 2 years ago
Multi-device OpenCL kernel load balancer and pipeliner API for C#. Uses a shared-distributed memory model to keep GPUs updated fast while using the same kernel on all devices (for simplicity).
Created 2017-03-29
403 commits to master branch, last one 2 years ago
A dual-GPU DEM solver with complex grain geometry support
Created 2021-04-22
655 commits to main branch, last one 3 months ago
22
45
apache-2.0
10
POT3D: High Performance Potential Field Solver
Created 2021-01-21
67 commits to main branch, last one 2 months ago
Artifact for OSDI'23: MGG: Accelerating Graph Neural Networks with Fine-grained intra-kernel Communication-Computation Pipelining on Multi-GPU Platforms.
Created 2023-03-26
284 commits to GCP branch, last one 8 months ago