Trending repositories for topic diffusion
Image editing is worth a single LoRA! 0.1% training data for fantastic image editing! Training released! Surpasses GPT-4o in ID persistence! Official ComfyUI workflow release! Only 4GB VRAM is enough ...
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
Official implementation for the paper "Full-Order Sampling-Based MPC for Torque-Level Locomotion Control via Diffusion-Style Annealing". DIAL-MPC is a novel sampling-based MPC framework for legged rob...
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
Hung-yi Lee's Deep Learning Tutorial (recommended by Prof. Hung-yi Lee 👍, the "Apple Book" 🍎). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases
[TMLR 2025🔥] A survey for the autoregressive models in vision.
🔥🔥 UNO: A Universal Customization Method for Both Single and Multi-Subject Conditioning
Collection of diffusion model papers categorized by their subareas
A multi-model framework generating fully featured osu! beatmaps for all gamemodes from spectrogram inputs.
Seamlessly integrate state-of-the-art transformer models into robotics stacks
🔥 InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity
This is the project for 'Any2Caption', Interpreting Any Condition to Caption for Controllable Video Generation
Create chatbot and AI agent workflows with unified access.
Automatically fetches diffusion NLP papers from arXiv. More paper information can be found in another repository, "Diffusion-LM-Papers".
📚A curated list of Awesome Diffusion Inference Papers with codes: Sampling, Caching, Multi-GPUs, etc. 🎉🎉
Official implementation of All Atom Diffusion Transformers (ICML 2025)
An open-source implementation of Regional Adaptive Sampling (RAS), a novel diffusion model sampling strategy that introduces regional variability in sampling steps
🔥ICLR 2025 (Spotlight) One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation Using a Single Prompt
[CVPR 2024 - Oral, Best Paper Award Candidate] Marigold: Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation
Diffusion Models in Medical Imaging (Published in Medical Image Analysis Journal)
An easy way to view the images and metadata generated by Stable Diffusion's Automatic1111 WebUI
PyTorch implementation of diffusion models on Lie groups for 6D grasp pose generation. https://sites.google.com/view/se3dif/home
🔥 Official ComfyUI native node for InfiniteYou with FLUX
ComfyUI nodes collection: better TAESD previews (including batch previews), improved HyperTile and Deep Shrink nodes
[CVPR 2025] FaithDiff for Classic Film Rejuvenation, Old Photo Revival, Social Media Restoration, Image Enhancement and AIGC Enhancement.
Scalable and memory-optimized training of diffusion models
🎬 3.7× faster video generation E2E 🖼️ 1.6× faster image generation E2E ⚡ ColumnSparseAttn 9.3× vs FlashAttn‑3 💨 ColumnSparseGEMM 2.5× vs cuBLAS
[CVPR 2025] Official code repository for "Pixel-level and Semantic-level Adjustable Super-resolution: A Dual-LoRA Approach"
List of diffusion related active submissions on OpenReview for ICLR 2025.
Code for "Around the World in 80 Timesteps: A Generative Approach to Global Visual Geolocation"
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
[CVPR 2025 Highlight🔥] Identity-Preserving Text-to-Video Generation by Frequency Decomposition
Official implementation for "RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers" (ICML 2025)
[WACV'25 Oral] Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think
Video-Inpaint-Anything: This is the inference code for our paper CoCoCo: Improving Text-Guided Video Inpainting for Better Consistency, Controllability and Compatibility.
Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models such as VideoCrafter, OpenSora, ModelScope and StableVideoDiffusion by finetuning them using various rewa...
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
Lumina-T2X is a unified framework for Text to Any Modality Generation
MuseV: Infinite-length and High Fidelity Virtual Human Video Generation with Visual Conditioned Parallel Denoising
Flux diffusion model implementation using quantized fp8 matmul & remaining layers use faster half precision accumulate, which is ~2x faster on consumer devices.
Official Code Release for [SIGGRAPH 2024] DiLightNet: Fine-grained Lighting Control for Diffusion-based Image Generation
VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation
[RSS 2024] Code for "Multimodal Diffusion Transformer: Learning Versatile Behavior from Multimodal Goals" for CALVIN experiments with pre-trained weights
Official implementation for the paper "Model-based Diffusion for Trajectory Optimization". Model-based diffusion (MBD) is a novel diffusion-based trajectory optimization framework that employs a dynam...
Code for FreeScale, a tuning-free method for higher-resolution visual generation
The aim of this repository is to test and implement Flow-Matching-based models
Python library for solving reinforcement learning (RL) problems using generative models (e.g. Diffusion Models).