8 results found
ILGPU JIT Compiler for high-performance .NET GPU programs
Created
2017-01-08
2,168 commits to master branch, last one 5 months ago
row-major matmul optimization
Created
2018-10-28
158 commits to master branch, last one about a year ago
Learning how to write "Less Slow" code in C++ 20, C 99, CUDA, PTX, & Assembly, from numerics & SIMD to coroutines, ranges, exception handling, networking and user-space IO
Created
2022-03-03
347 commits to main branch, last one 3 days ago
🔥🔥🔥 A collection of some awesome public CUDA, cuBLAS, cuDNN, CUTLASS, TensorRT, TensorRT-LLM, Triton, TVM, MLIR and High Performance Computing (HPC) projects.
Created
2023-02-23
27 commits to main branch, last one 5 days ago
CudaPAD is a PTX/SASS viewer for NVIDIA CUDA kernels that provides an on-the-fly view of the assembly.
Created
2015-06-14
28 commits to master branch, last one 2 years ago
Free software file format parser for Avid ProTools sessions
Created
2015-07-15
357 commits to master branch, last one 2 years ago
A simple profiler to count NVIDIA PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.
Created
2023-01-11
8 commits to master branch, last one about a year ago
Energinet's Model Testbench. Automate grid compliance studies in PSCAD and PowerFactory.
Created
2023-01-04
276 commits to main branch, last one 10 days ago