8 results found

124
1.5k
other
37
ILGPU JIT Compiler for high-performance .NET GPU programs
Created 2017-01-08
2,168 commits to master branch, last one 5 months ago
row-major matmul optimization
Created 2018-10-28
158 commits to master branch, last one about a year ago
27
398
apache-2.0
12
Learning how to write "Less Slow" code in C++20, C99, CUDA, PTX, & Assembly, from numerics & SIMD to coroutines, ranges, exception handling, networking, and user-space IO
Created 2022-03-03
347 commits to main branch, last one 3 days ago
🔥🔥🔥 A collection of some awesome public CUDA, cuBLAS, cuDNN, CUTLASS, TensorRT, TensorRT-LLM, Triton, TVM, MLIR and High Performance Computing (HPC) projects.
Created 2023-02-23
27 commits to main branch, last one 5 days ago
CudaPAD is a PTX/SASS viewer for NVIDIA CUDA kernels that provides an on-the-fly view of the assembly.
Created 2015-06-14
28 commits to master branch, last one 2 years ago
18
72
lgpl-2.1
18
Free software file format parser for Avid ProTools sessions
Created 2015-07-15
357 commits to master branch, last one 2 years ago
A simple profiler to count NVIDIA PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.
Created 2023-01-11
8 commits to master branch, last one about a year ago
Energinet's Model Testbench. Automate grid compliance studies in PSCAD and PowerFactory.
Created 2023-01-04
276 commits to main branch, last one 10 days ago