23 results found

477
4.6k
apache-2.0
178
The Forge Cross-Platform Rendering Framework: PC Windows, Steam Deck (native), Ray Tracing, macOS / iOS, Android, XBOX, PS4, PS5, Switch, Quest 2
Created 2017-10-03
498 commits to master branch, last one 22 days ago
371
1.5k
apache-2.0
93
Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
This repository has been archived
Created 2017-09-08
1,695 commits to master branch, last one 3 years ago
Multi-threaded GUI manager for mass creation of AI-generated art with support for multiple GPUs.
Created 2022-09-13
370 commits to main branch, last one 2 months ago
Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and TIMM models.
Created 2020-06-01
96 commits to master branch, last one 2 months ago
GPU-ready Dockerfile to run Stability.AI stable-diffusion model v2 with a simple web interface. Includes multi-GPU support.
Created 2022-08-25
87 commits to master branch, last one 7 months ago
31
291
bsd-3-clause
10
Package for writing high-level code for parallel high-performance stencil computations that can be deployed on both GPUs and CPUs
Created 2020-12-22
883 commits to main branch, last one 8 days ago
92
279
other
57
QUDA is a library for performing calculations in lattice QCD on GPUs.
Created 2011-01-27
13,864 commits to develop branch, last one 8 days ago
A PyTorch implementation of the 'FaceNet' paper for training a facial recognition model with Triplet Loss using the glint360k dataset. A pre-trained model using Triplet Loss is available for download.
Created 2019-03-21
267 commits to master branch, last one 2 years ago
Distributed tensors and Machine Learning framework with GPU and MPI acceleration in Python
Created 2018-05-17
6,636 commits to main branch, last one 21 hours ago
54
172
apache-2.0
9
Multi-GPU pre-training on one machine for BERT from scratch without Horovod (data parallelism)
Created 2018-12-25
98 commits to master branch, last one 4 months ago
Chains stable-diffusion-webui instances together to facilitate faster image generation.
Created 2023-03-06
216 commits to master branch, last one 15 days ago
68
156
unknown
37
The world's first CUDA implementation of Weakly-Compressible Smoothed Particle Hydrodynamics
Created 2014-05-28
5,350 commits to master branch, last one 3 years ago
16
154
bsd-3-clause
14
Almost trivial distributed parallelization of stencil-based GPU and CPU applications on a regular staggered grid
Created 2019-12-06
338 commits to master branch, last one 4 months ago
Efficient Distributed GPU Programming for Exascale, an SC/ISC Tutorial
Created 2021-09-23
240 commits to main branch, last one 3 days ago
High-level C++ for Accelerator Clusters
Created 2019-07-16
850 commits to master branch, last one 10 days ago
Efficient and Scalable Physics-Informed Deep Learning and Scientific Machine Learning on top of TensorFlow for multi-worker distributed computing
Created 2020-10-03
309 commits to main branch, last one 2 years ago
Multi-device OpenCL kernel load balancer and pipeliner API for C#. Uses a shared-distributed memory model to keep GPUs updated quickly while using the same kernel on all devices (for simplicity).
Created 2017-03-29
403 commits to master branch, last one about a year ago
A dual-GPU DEM solver with complex grain geometry support
Created 2021-04-22
650 commits to main branch, last one 8 days ago
22
41
apache-2.0
10
POT3D: High Performance Potential Field Solver
Created 2021-01-21
61 commits to main branch, last one 4 days ago
Artifact for OSDI'23: MGG: Accelerating Graph Neural Networks with Fine-grained intra-kernel Communication-Computation Pipelining on Multi-GPU Platforms.
Created 2023-03-26
284 commits to GCP branch, last one 2 months ago