41 results found

4.5k
29.7k
apache-2.0
242
A high-throughput and memory-efficient inference and serving engine for LLMs
Created 2023-02-09
3,290 commits to main branch, last one 7 hours ago
3.5k
11.8k
apache-2.0
375
Open deep learning compiler stack for cpu, gpu and specialized accelerators
Created 2016-10-12
12,722 commits to main branch, last one 2 days ago
849
9.4k
mit
128
NumPy & SciPy for GPU
Created 2016-11-01
29,264 commits to main branch, last one a day ago
281
1.7k
apache-2.0
139
This repository has no description...
This repository has been archived
Created 2016-06-23
359 commits to master branch, last one 6 years ago
510
1.5k
lgpl-3.0
47
A deep learning package for many-body potential energy representation and molecular dynamics
Created 2017-12-12
2,534 commits to r2 branch, last one about a month ago
83
1.2k
apache-2.0
29
stdgpu: Efficient STL-like Data Structures on the GPU
Created 2019-08-16
550 commits to master branch, last one 15 days ago
Large-scale LLM inference engine
Created 2023-06-23
801 commits to main branch, last one 2 days ago
65
430
mit
59
Dockerfiles for the various software layers defined in the ROCm software platform
Created 2016-02-05
197 commits to master branch, last one 2 months ago
74
356
mpl-2.0
22
Abstraction Library for Parallel Kernel Acceleration :llama:
Created 2014-11-05
3,040 commits to develop branch, last one 27 days ago
165
345
other
59
Next generation BLAS implementation for ROCm platform
Created 2015-10-08
5,405 commits to develop branch, last one a day ago
Agenium Scale vectorization library for CPUs and GPUs
Created 2019-04-10
172 commits to master branch, last one 3 years ago
47
282
other
16
AMD GPU (ROCm) programming in Julia
Created 2020-07-02
1,122 commits to master branch, last one 20 days ago
47
271
apache-2.0
20
Kubernetes (k8s) device plugin to enable registration of AMD GPU to a container cluster
Created 2018-04-03
116 commits to master branch, last one 16 days ago
46
206
apache-2.0
31
AOMP is an open source Clang/LLVM based compiler with added support for the OpenMP® API on Radeon™ GPUs. Use this repository for releases, issues, documentation, packaging, and examples.
Created 2019-01-19
4,032 commits to aomp-dev branch, last one a day ago
27
192
bsd-3-clause
21
Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm
Created 2018-03-01
560 commits to master branch, last one 13 days ago
73
185
mit
24
MIVisionX toolkit is a set of comprehensive computer vision and machine intelligence libraries, utilities, and applications bundled into a single toolkit. AMD MIVisionX also delivers a highly optimize...
Created 2018-12-20
975 commits to develop branch, last one 8 days ago
64
183
apache-2.0
14
Zero-knowledge template library
Created 2022-06-14
350 commits to main branch, last one a day ago
84
174
other
53
Next generation FFT implementation for ROCm
Created 2016-03-03
1,993 commits to develop branch, last one a day ago
16
162
mit
24
GPUFORT: S2S translation tool for CUDA Fortran and Fortran+X in the spirit of hipify
Created 2021-02-15
313 commits to main branch, last one 3 years ago
69
161
mit
47
ROCm Parallel Primitives
Created 2017-12-13
1,527 commits to develop branch, last one a day ago
AUTOMATIC1111/stable-diffusion-webui for CUDA and ROCm on NixOS
Created 2022-12-31
17 commits to master branch, last one 11 months ago
40
127
bsd-3-clause
17
Domain specific library for electronic structure calculations
Created 2015-10-16
8,834 commits to develop branch, last one 21 hours ago
Stable Diffusion Docker image preconfigured for usage with AMD Radeon cards
Created 2022-08-29
15 commits to main branch, last one about a year ago
78
118
other
38
ROCm BLAS marshalling library
Created 2017-04-10
1,170 commits to develop branch, last one 2 days ago
Install guide of ROCm and Tensorflow on Ubuntu for the RX580
Created 2020-11-05
45 commits to main branch, last one about a month ago
69
110
mit
50
RAND library for HIP programming language
Created 2017-07-31
1,346 commits to develop branch, last one 15 days ago
AMD OpenCL userspace drivers for Fedora. Currently not working on Fedora 37
This repository has been archived
Created 2022-01-02
65 commits to master branch, last one about a year ago
51
91
other
28
Next generation LAPACK implementation for ROCm platform
Created 2018-05-22
735 commits to develop branch, last one a day ago
The PennyLane-Lightning plugin provides a fast state-vector simulator written in C++ for use with PennyLane
Created 2020-07-06
634 commits to master branch, last one 2 days ago