Trending repositories for topic gpu
Productive, portable, and performant GPU programming in Python.
Tensors and Dynamic neural networks in Python with strong GPU acceleration
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Lightweight Armoury Crate alternative for Asus laptops and ROG Ally. Control tool for ROG Zephyrus G14, G15, G16, M16, Flow X13, Flow X16, TUF, Strix, Scar and other models
This is the release repository for Fan Control, a highly customizable fan controlling software for Windows.
High-Performance Cross-Platform Monte Carlo Renderer Based on LuisaCompute
High-Performance Rendering Framework on Stream Architectures
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
John the Ripper jumbo - advanced offline password cracker, which supports hundreds of hash and cipher types, and runs on many operating systems, CPUs, GPUs, and even some FPGAs
An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.
A fast communication-overlapping library for tensor parallelism on GPUs.
GLIM: versatile and extensible range-based 3D localization and mapping framework
This repository contains a Vulkan Framework designed to enable developers to get up and running quickly for creating sample content and rapid prototyping. It is designed to be easy to build and have t...
🦀⚙️ Sudoless performance monitoring for Apple Silicon processors. CPU / GPU / RAM usage, power consumption & temperature 🌡️
Xplace 2.0: An Extremely Fast, Extensible and Deterministic Placement Framework with Detailed-Routability Optimization
GPU switching without logging out for NVIDIA Optimus laptops under Linux
Extract embedded VBIOS from (almost) any BIOS update, the hail-mary way
Open-source tool for keeping NVIDIA GPUs updated, featuring fully customizable driver installs for complete control, multi-GPU support, and more!
The fastest and most memory efficient lattice Boltzmann CFD software, running on all GPUs and CPUs via OpenCL. Free for non-commercial use.
BioNeMo Framework: For building and adapting AI models in drug discovery at scale
Publishes some small parts of my personal, daily-use Houdini accessories
GPU & Accelerator process monitoring for AMD, Apple, Huawei, Intel, NVIDIA and Qualcomm
Run serverless GPU workloads with fast cold starts on bare-metal servers, anywhere in the world
A code for fast, massively parallel simulations of two-phase flows with heat transfer
Recipes for reproducing training and serving benchmarks for large machine learning models using GPUs on Google Cloud.
📖A curated list of Awesome Diffusion Inference Papers with codes, such as Sampling, Caching, Multi-GPUs, etc. 🎉🎉
Information and links about Epic's Unreal Engine including Verse programming language for UEFN, Unreal, Fortnite and the Metaverse along with UE5 and the UE6 convergence
NVIDIA-accelerated packages for arm motion planning and control
TypeScript library that enhances the WebGPU API, allowing resource management in a type-safe, declarative way.
Multi-platform high-performance compute language extension for Rust.
Go library for embedded vector search and semantic embeddings using llama.cpp
Best practices & guides on how to write distributed PyTorch training code
JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs welcome).
This project collects GPU benchmarks from various cloud providers and compares them to fixed per token costs. Use our tool for efficient LLM GPU selections and cost-effective AI models. LLM provider p...
☁️ VRAM for SDXL, AnimateDiff, and upscalers. Run your workflows on the cloud, from your local ComfyUI
NviWatch: A blazingly fast Rust-based TUI for managing and monitoring NVIDIA GPU processes
Transforms your CasADi functions into batchable JAX-compatible functions. By combining the power of CasADi with the flexibility of JAX, JAXADi enables the creation of efficient code that runs smoothly...
SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such ...
Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python with AI-powered optimization proposals
Real-time image and video processing library similar to GPUImage, with built-in beauty filters, achieving commercial-grade beauty effects. Written in C++11 and based on OpenGL/ES.
A 3D FPGA GPU for real-time rasterization with a tile-based deferred rendering (TBDR) architecture, featuring transform & lighting (T&L), back-face culling, MSAA anti-aliasing, ordered dithering, etc.
An innovative library for efficient LLM inference via low-bit quantization
PhantomFHE: A CUDA-Accelerated Homomorphic Encryption Library
These are my experiments with BVH build algorithms on the GPU.
A collection of GTSAM factors and optimizers for point cloud SLAM
RAG (Retrieval-augmented generation) ChatBot that provides answers based on contextual information extracted from a collection of Markdown files.
Collection of best practices, reference architectures, model training examples and utilities to train large models on AWS.