23 results found
- Filter by Primary Language:
- C++ (9)
- Cuda (8)
- Python (2)
- Rust (1)
- JavaScript (1)
- C# (1)
- Jupyter Notebook (1)
A General-purpose Task-parallel Programming System using Modern C++
Created
2018-04-18
2,372 commits to master branch, last one a day ago
Sample code for my CUDA programming book
Created
2019-05-03
919 commits to master branch, last one about a year ago
CUDA Core Compute Libraries
Created
2020-09-17
10,250 commits to main branch, last one 18 hours ago
🚀 Your YOLO Deployment Powerhouse. With the synergy of TensorRT Plugins, CUDA Kernels, and CUDA Graphs, experience lightning-fast i...
Created
2024-01-28
197 commits to main branch, last one a day ago
Thin, unified, C++-flavored wrappers for the CUDA APIs
Created
2016-11-11
1,004 commits to master branch, last one 4 months ago
TinyChatEngine: On-Device LLM Inference Library
Created
2023-05-24
55 commits to main branch, last one 5 months ago
Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
Created
2022-09-01
29 commits to main branch, last one 5 months ago
Safe Rust wrapper around the CUDA toolkit
Created
2022-09-16
272 commits to main branch, last one 4 days ago
A simple GPU hash table implemented in CUDA using lock free techniques
Created
2020-03-01
31 commits to master branch, last one about a year ago
A self-learning tutorial for CUDA high-performance programming.
Created
2022-10-11
102 commits to develop branch, last one 4 days ago
LLM notes, including model inference, transformer model structure, and LLM framework code analysis
Created
2024-09-18
174 commits to main branch, last one 23 hours ago
This is an archive of materials produced for an introductory class on CUDA programming at Stanford University in 2010
Created
2015-03-14
112 commits to master branch, last one 2 years ago
From zero to hero CUDA for accelerating maths and machine learning on GPU.
Created
2024-05-20
14 commits to main branch, last one 5 months ago
μ-Cuda, COVER THE LAST MILE OF CUDA. With features: intellisense-friendly, structured launch, automatic cuda graph generation and updating.
Created
2022-12-18
384 commits to mini20 branch, last one 26 days ago
An implementation of HIP that works on CPUs, across OSes.
Created
2020-08-28
177 commits to master branch, last one 9 months ago
CUDA kernel author's tools
Created
2019-02-18
201 commits to master branch, last one 4 years ago
CudaPAD is a PTX/SASS viewer for NVIDIA CUDA kernels and provides an on-the-fly view of the assembly.
Created
2015-06-14
28 commits to master branch, last one about a year ago
CUDA Guide
Created
2020-09-25
18 commits to master branch, last one 11 months ago
Install CUDA on Windows 11 using WSL2
Created
2023-05-23
52 commits to main branch, last one about a year ago
Speed up image preprocessing with CUDA when handling images or running TensorRT inference
Created
2023-05-29
49 commits to main branch, last one 3 days ago
YOLOv9 TensorRT deployment acceleration, providing two implementation methods: C++ and Python 🔥🔥🔥
Created
2024-02-23
17 commits to master branch, last one 9 months ago
Getting started with learning CUDA programming
Created
2022-02-02
73 commits to main branch, last one 5 months ago
Companion code for the bilibili video "Introduction to CUDA 12.x Parallel Programming (C++ Edition)"
Created
2024-01-21
9 commits to main branch, last one 4 months ago