Statistics for topic cuda
RepositoryStats tracks 559,026 GitHub repositories; 607 of these are tagged with the cuda topic. The most common primary language for repositories using this topic is C++ (221). Other languages include: Python (142), Cuda (68), C (28), Jupyter Notebook (25), Rust (19), Dockerfile (11), Shell (11).
Stargazers over time for topic cuda
Most starred repositories for topic cuda
Trending repositories for topic cuda
A high-throughput and memory-efficient inference and serving engine for LLMs
SGLang is a fast serving framework for large language models and vision language models.
NviWatch: A blazingly fast Rust-based TUI for managing and monitoring NVIDIA GPU processes
An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.
A throughput-oriented high-performance serving framework for LLMs
A Rust library integrated with ONNXRuntime, providing a collection of Computer Vision and Vision-Language models.
A fast communication-overlapping library for tensor parallelism on GPUs.
🚀 Your YOLO deployment powerhouse. With TensorRT Plugins, CUDA Kernels, and CUDA Graphs working together, experience lightning-fast inference speed.
Beginner's Guide to reComputer Jetson
Install PyTorch distributions with computation backend auto-detection
Several optimization methods for half-precision general matrix-vector multiplication (HGEMV) using CUDA cores (a minimal sketch follows this list).
Best inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.
Multi-platform high-performance compute language extension for Rust.
🎉 CUDA/C++ notes / technical blog: fp32, fp16/bf16, fp8/int8, flash_attn, sgemm, sgemv, warp/block reduce, dot prod, elementwise, softmax, layernorm, rmsnorm, hist, etc.
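To make the HGEMV entry above concrete, here is a minimal sketch of one common CUDA-core optimization for half-precision matrix-vector multiplication: one warp per output row with a shuffle-based reduction. This is an illustration only, not code from any listed repository; the kernel name and launch configuration below are assumptions.

#include <cuda_fp16.h>

// Illustrative warp-per-row HGEMV: y = A * x, with A stored row-major (M x N).
// Each warp owns one row; lanes stride across the columns and accumulate in fp32,
// then a warp shuffle reduction combines the partial sums.
__global__ void hgemv_warp_per_row(const half* __restrict__ A,
                                   const half* __restrict__ x,
                                   half* __restrict__ y,
                                   int M, int N) {
    const int lane = threadIdx.x & 31;                      // lane id within the warp
    const int row  = blockIdx.x * (blockDim.x >> 5) + (threadIdx.x >> 5);
    if (row >= M) return;

    float acc = 0.0f;
    for (int col = lane; col < N; col += 32) {              // lanes stride across the row
        acc += __half2float(A[row * N + col]) * __half2float(x[col]);
    }
    for (int offset = 16; offset > 0; offset >>= 1) {       // warp-level tree reduction
        acc += __shfl_down_sync(0xffffffffu, acc, offset);
    }
    if (lane == 0) y[row] = __float2half(acc);
}

// Example launch (assumed parameters): 128 threads per block = 4 warps = 4 rows per block.
// hgemv_warp_per_row<<<(M + 3) / 4, 128>>>(A, x, y, M, N);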