42 results found

4.9k
32.3k
apache-2.0
263
A high-throughput and memory-efficient inference and serving engine for LLMs
Created 2023-02-09
3,879 commits to main branch, last one 9 hours ago
3.5k
11.9k
apache-2.0
377
Open deep learning compiler stack for cpu, gpu and specialized accelerators
Created 2016-10-12
12,753 commits to main branch, last one 6 days ago
861
9.6k
mit
127
NumPy & SciPy for GPU
Created 2016-11-01
29,401 commits to main branch, last one 23 hours ago
281
1.7k
apache-2.0
139
This repository has no description...
This repository has been archived
Created 2016-06-23
359 commits to master branch, last one 6 years ago
520
1.5k
lgpl-3.0
47
A deep learning package for many-body potential energy representation and molecular dynamics
Created 2017-12-12
3,140 commits to master branch, last one 28 days ago
Large-scale LLM inference engine
Created 2023-06-23
906 commits to main branch, last one 17 hours ago
84
1.2k
apache-2.0
29
stdgpu: Efficient STL-like Data Structures on the GPU
Created 2019-08-16
566 commits to master branch, last one 25 days ago
67
441
mit
60
Dockerfiles for the various software layers defined in the ROCm software platform
Created 2016-02-05
197 commits to master branch, last one 4 months ago
74
358
mpl-2.0
22
Abstraction Library for Parallel Kernel Acceleration :llama:
Created 2014-11-05
3,052 commits to develop branch, last one 9 days ago
169
351
other
59
Next generation BLAS implementation for ROCm platform
Created 2015-10-08
5,457 commits to develop branch, last one 19 hours ago
Agenium Scale vectorization library for CPUs and GPUs
Created 2019-04-10
172 commits to master branch, last one 3 years ago
48
283
other
18
AMD GPU (ROCm) programming in Julia
Created 2020-07-02
1,141 commits to master branch, last one 11 days ago
52
280
apache-2.0
20
Kubernetes (k8s) device plugin to enable registration of AMD GPU to a container cluster
Created 2018-04-03
125 commits to master branch, last one 9 days ago
48
208
apache-2.0
31
AOMP is an open source Clang/LLVM based compiler with added support for the OpenMP® API on Radeon™ GPUs. Use this repository for releases, issues, documentation, packaging, and examples.
Created 2019-01-19
4,133 commits to aomp-dev branch, last one a day ago
28
197
bsd-3-clause
21
Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm
Created 2018-03-01
562 commits to master branch, last one 10 days ago
65
188
apache-2.0
14
Zero-knowledge template library
Created 2022-06-14
362 commits to main branch, last one 3 days ago
75
187
mit
25
MIVisionX toolkit is a set of comprehensive computer vision and machine intelligence libraries, utilities, and applications bundled into a single toolkit. AMD MIVisionX also delivers a highly optimize...
Created 2018-12-20
995 commits to develop branch, last one 4 days ago
85
182
other
53
Next generation FFT implementation for ROCm
Created 2016-03-03
2,020 commits to develop branch, last one a day ago
70
166
mit
47
ROCm Parallel Primitives
Created 2017-12-13
1,544 commits to develop branch, last one a day ago
16
161
mit
24
GPUFORT: S2S translation tool for CUDA Fortran and Fortran+X in the spirit of hipify
Created 2021-02-15
313 commits to main branch, last one 3 years ago
AUTOMATIC1111/stable-diffusion-webui for CUDA and ROCm on NixOS
Created 2022-12-31
17 commits to master branch, last one about a year ago
42
129
bsd-3-clause
17
Domain specific library for electronic structure calculations
Created 2015-10-16
8,842 commits to develop branch, last one 2 days ago
Stable Diffusion Docker image preconfigured for usage with AMD Radeon cards
Created 2022-08-29
15 commits to main branch, last one about a year ago
79
123
other
38
ROCm BLAS marshalling library
Created 2017-04-10
1,182 commits to develop branch, last one a day ago
Installation guide for ROCm and TensorFlow on Ubuntu for the RX580
Created 2020-11-05
45 commits to main branch, last one 2 months ago
70
112
mit
50
RAND library for HIP programming language
Created 2017-07-31
1,357 commits to develop branch, last one 7 days ago
The Lightning plugin ecosystem provides fast quantum state-vector and tensor network simulators written in C++ for use with PennyLane.
Created 2020-07-06
678 commits to master branch, last one 21 hours ago
53
95
other
28
Next generation LAPACK implementation for ROCm platform
Created 2018-05-22
749 commits to develop branch, last one 2 days ago
AMD OpenCL userspace drivers for Fedora. Currently not working on Fedora 37
This repository has been archived
Created 2022-01-02
65 commits to master branch, last one about a year ago