41 results found

4.6k
30.5k
apache-2.0
248
A high-throughput and memory-efficient inference and serving engine for LLMs
Created 2023-02-09
3,511 commits to main branch, last one 13 hours ago
3.5k
11.8k
apache-2.0
376
Open deep learning compiler stack for cpu, gpu and specialized accelerators
Created 2016-10-12
12,740 commits to main branch, last one 2 days ago
855
9.5k
mit
128
NumPy & SciPy for GPU
Created 2016-11-01
29,305 commits to main branch, last one a day ago
281
1.7k
apache-2.0
139
This repository has no description...
This repository has been archived
Created 2016-06-23
359 commits to master branch, last one 6 years ago
515
1.5k
lgpl-3.0
47
A deep learning package for many-body potential energy representation and molecular dynamics
Created 2017-12-12
2,543 commits to r2 branch, last one 2 days ago
83
1.2k
apache-2.0
29
stdgpu: Efficient STL-like Data Structures on the GPU
Created 2019-08-16
565 commits to master branch, last one a day ago
Large-scale LLM inference engine
Created 2023-06-23
825 commits to main branch, last one 21 hours ago
65
432
mit
60
Dockerfiles for the various software layers defined in the ROCm software platform
Created 2016-02-05
197 commits to master branch, last one 3 months ago
74
356
mpl-2.0
22
Abstraction Library for Parallel Kernel Acceleration 🦙
Created 2014-11-05
3,046 commits to develop branch, last one a day ago
167
346
other
59
Next generation BLAS implementation for ROCm platform
Created 2015-10-08
5,422 commits to develop branch, last one a day ago
Agenium Scale vectorization library for CPUs and GPUs
Created 2019-04-10
172 commits to master branch, last one 3 years ago
47
283
other
16
AMD GPU (ROCm) programming in Julia
Created 2020-07-02
1,129 commits to master branch, last one a day ago
47
274
apache-2.0
20
Kubernetes (k8s) device plugin to enable registration of AMD GPU to a container cluster
Created 2018-04-03
117 commits to master branch, last one 2 days ago
47
206
apache-2.0
31
AOMP is an open source Clang/LLVM based compiler with added support for the OpenMP® API on Radeon™ GPUs. Use this repository for releases, issues, documentation, packaging, and examples.
Created 2019-01-19
4,061 commits to aomp-dev branch, last one a day ago
27
196
bsd-3-clause
21
Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm
Created 2018-03-01
560 commits to master branch, last one 27 days ago
73
186
mit
25
MIVisionX toolkit is a set of comprehensive computer vision and machine intelligence libraries, utilities, and applications bundled into a single toolkit. AMD MIVisionX also delivers a highly optimize...
Created 2018-12-20
983 commits to develop branch, last one a day ago
65
185
apache-2.0
14
Zero-knowledge template library
Created 2022-06-14
350 commits to main branch, last one 15 days ago
84
176
other
53
Next generation FFT implementation for ROCm
Created 2016-03-03
2,001 commits to develop branch, last one 2 days ago
16
162
mit
24
GPUFORT: S2S translation tool for CUDA Fortran and Fortran+X in the spirit of hipify
Created 2021-02-15
313 commits to main branch, last one 3 years ago
69
162
mit
47
ROCm Parallel Primitives
Created 2017-12-13
1,539 commits to develop branch, last one 8 hours ago
AUTOMATIC1111/stable-diffusion-webui for CUDA and ROCm on NixOS
Created 2022-12-31
17 commits to master branch, last one about a year ago
41
127
bsd-3-clause
17
Domain specific library for electronic structure calculations
Created 2015-10-16
8,835 commits to develop branch, last one 13 days ago
Stable Diffusion Docker image preconfigured for usage with AMD Radeon cards
Created 2022-08-29
15 commits to main branch, last one about a year ago
78
121
other
38
ROCm BLAS marshalling library
Created 2017-04-10
1,175 commits to develop branch, last one a day ago
Install guide for ROCm and TensorFlow on Ubuntu for the RX580
Created 2020-11-05
45 commits to main branch, last one about a month ago
69
111
mit
50
RAND library for HIP programming language
Created 2017-07-31
1,352 commits to develop branch, last one 11 hours ago
53
95
other
28
Next generation LAPACK implementation for ROCm platform
Created 2018-05-22
742 commits to develop branch, last one 20 hours ago
The PennyLane-Lightning plugin provides a fast state-vector simulator written in C++ for use with PennyLane
Created 2020-07-06
649 commits to master branch, last one 19 hours ago
AMD OpenCL userspace drivers for Fedora. Currently not working on Fedora 37.
This repository has been archived
Created 2022-01-02
65 commits to master branch, last one about a year ago