Trending repositories for topic diffusion
🔥🔥 UNO: A Universal Customization Method for Both Single and Multi-Subject Conditioning
FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis
🔥 InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
Memory-optimized training library for diffusion models
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
"Hung-yi Lee Deep Learning Tutorial" (recommended by Prof. Hung-yi Lee 👍, the "Apple Book" 🍎). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases
An easy 1-click way to create beautiful artwork on your PC using AI, with no tech knowledge. Provides a browser UI for generating images from text prompts and images. Just enter your text prompt, and ...
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
Simple and readable code for training and sampling from diffusion models
This is the project for 'Any2Caption', Interpreting Any Condition to Caption for Controllable Video Generation
Automatically collects diffusion NLP papers from arXiv. More paper information can be found in another repository, "Diffusion-LM-Papers".
Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models such as VideoCrafter, OpenSora, ModelScope and StableVideoDiffusion by finetuning them using various rewa...
🔥🔥 UNO: A Universal Customization Method for Both Single and Multi-Subject Conditioning
FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis
This is the project for 'Any2Caption', Interpreting Any Condition to Caption for Controllable Video Generation
🔥 InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity
🔥ICLR 2025 (Spotlight) One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation Using a Single Prompt
Automatically collects diffusion NLP papers from arXiv. More paper information can be found in another repository, "Diffusion-LM-Papers".
Memory-optimized training library for diffusion models
Simple and readable code for training and sampling from diffusion models
Official implementation for the paper "Full-Order Sampling-Based MPC for Torque-Level Locomotion Control via Diffusion-Style Annealing". DIAL-MPC is a novel sampling-based MPC framework for legged rob...
Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models such as VideoCrafter, OpenSora, ModelScope and StableVideoDiffusion by finetuning them using various rewa...
[CVPR 2025] Official code repository for "Pixel-level and Semantic-level Adjustable Super-resolution: A Dual-LoRA Approach"
The official PyTorch code for RoHM: Robust Human Motion Reconstruction via Diffusion.
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
[CVPR 2025 Highlight🔥] Identity-Preserving Text-to-Video Generation by Frequency Decomposition
🔥🔥 UNO: A Universal Customization Method for Both Single and Multi-Subject Conditioning
FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis
🔥 InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
"Hung-yi Lee Deep Learning Tutorial" (recommended by Prof. Hung-yi Lee 👍, the "Apple Book" 🍎). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
Memory-optimized training library for diffusion models
[TMLR 2025🔥] A survey for the autoregressive models in vision.
Using Low-rank adaptation to quickly fine-tune diffusion models.
An easy 1-click way to create beautiful artwork on your PC using AI, with no tech knowledge. Provides a browser UI for generating images from text prompts and images. Just enter your text prompt, and ...
🔥🔥 UNO: A Universal Customization Method for Both Single and Multi-Subject Conditioning
FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis
This is the project for 'Any2Caption', Interpreting Any Condition to Caption for Controllable Video Generation
🔥 InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity
[CVPR 2025] Official code repository for "Pixel-level and Semantic-level Adjustable Super-resolution: A Dual-LoRA Approach"
Curated list of methods that focus on improving the efficiency of diffusion models
VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation
🔥ICLR 2025 (Spotlight) One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation Using a Single Prompt
[TMLR 2025🔥] A survey for the autoregressive models in vision.
[CVPR 2025] FaithDiff for Classic Film Rejuvenation, Old Photo Revival, Social Media Restoration, Image Enhancement and AIGC Enhancement.
Unofficial implementation of "Simplifying, Stabilizing & Scaling Continuous-Time Consistency Models" for MNIST
Seamlessly integrate state-of-the-art transformer models into robotics stacks
Automatically collects diffusion NLP papers from arXiv. More paper information can be found in another repository, "Diffusion-LM-Papers".
Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models such as VideoCrafter, OpenSora, ModelScope and StableVideoDiffusion by finetuning them using various rewa...
[CVPR 2024] Official implementation for "SVGDreamer: Text Guided SVG Generation with Diffusion Model" https://arxiv.org/abs/2312.16476
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
🔥🔥 UNO: A Universal Customization Method for Both Single and Multi-Subject Conditioning
FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis
🔥 InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity
🔥🔥 UNO: A Universal Customization Method for Both Single and Multi-Subject Conditioning
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
"Hung-yi Lee Deep Learning Tutorial" (recommended by Prof. Hung-yi Lee 👍, the "Apple Book" 🍎). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
Memory-optimized training library for diffusion models
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
Collection of diffusion model papers categorized by their subareas
[TMLR 2025🔥] A survey for the autoregressive models in vision.
Using Low-rank adaptation to quickly fine-tune diffusion models.
An easy 1-click way to create beautiful artwork on your PC using AI, with no tech knowledge. Provides a browser UI for generating images from text prompts and images. Just enter your text prompt, and ...
🔥 InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity
FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis
This is the project for 'Any2Caption', Interpreting Any Condition to Caption for Controllable Video Generation
[CVPR 2025] FaithDiff for Classic Film Rejuvenation, Old Photo Revival, Social Media Restoration, Image Enhancement and AIGC Enhancement.
[CVPR 2025] Official code repository for "Pixel-level and Semantic-level Adjustable Super-resolution: A Dual-LoRA Approach"
Unofficial implementation of "Simplifying, Stabilizing & Scaling Continuous-Time Consistency Models" for MNIST
VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation
Janky implementation of DiffuseHigh for ComfyUI
[RSS 2024] Code for "Multimodal Diffusion Transformer: Learning Versatile Behavior from Multimodal Goals" for CALVIN experiments with pre-trained weights
[TMLR 2025🔥] A survey for the autoregressive models in vision.
Curated list of methods that focus on improving the efficiency of diffusion models
Automatically collects diffusion NLP papers from arXiv. More paper information can be found in another repository, "Diffusion-LM-Papers".
[ICLR 2025] Code for the paper "Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning"
ComfyUI nodes collection: better TAESD previews (including batch previews), improved HyperTile and Deep Shrink nodes
stable-diffusion.cpp bindings for python
A multi-model framework that generates fully featured osu! beatmaps for all gamemodes from spectrogram inputs.
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
🔥 InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
🔥🔥 UNO: A Universal Customization Method for Both Single and Multi-Subject Conditioning
[CVPR 2025 Highlight🔥] Identity-Preserving Text-to-Video Generation by Frequency Decomposition
Official implementation for the paper "Full-Order Sampling-Based MPC for Torque-Level Locomotion Control via Diffusion-Style Annealing". DIAL-MPC is a novel sampling-based MPC framework for legged rob...
[TMLR 2025🔥] A survey for the autoregressive models in vision.
[WACV'25 Oral] Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think
FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis
Video-Inpaint-Anything: This is the inference code for our paper CoCoCo: Improving Text-Guided Video Inpainting for Better Consistency, Controllability and Compatibility.
Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models such as VideoCrafter, OpenSora, ModelScope and StableVideoDiffusion by finetuning them using various rewa...
Flux diffusion model implementation that uses quantized fp8 matmul, with the remaining layers using faster half-precision accumulation; ~2x faster on consumer devices.
🔥ICLR 2025 (Spotlight) One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation Using a Single Prompt
Official implementation for the paper "Model-based Diffusion for Trajectory Optimization". Model-based diffusion (MBD) is a novel diffusion-based trajectory optimization framework that employs a dynam...
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
"Hung-yi Lee Deep Learning Tutorial" (recommended by Prof. Hung-yi Lee 👍, the "Apple Book" 🍎). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
Lumina-T2X is a unified framework for Text to Any Modality Generation
🔥 InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
MuseV: Infinite-length and High Fidelity Virtual Human Video Generation with Visual Conditioned Parallel Denoising
Memory-optimized training library for diffusion models
[CVPR 2024 - Oral, Best Paper Award Candidate] Marigold: Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation
Collection of diffusion model papers categorized by their subareas
An easy 1-click way to create beautiful artwork on your PC using AI, with no tech knowledge. Provides a browser UI for generating images from text prompts and images. Just enter your text prompt, and ...
🔥 InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity
Official implementation for the paper "Full-Order Sampling-Based MPC for Torque-Level Locomotion Control via Diffusion-Style Annealing". DIAL-MPC is a novel sampling-based MPC framework for legged rob...
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
📖A curated list of Awesome Diffusion Inference Papers with codes: Sampling, Caching, Multi-GPUs, etc. 🎉🎉
Flux diffusion model implementation that uses quantized fp8 matmul, with the remaining layers using faster half-precision accumulation; ~2x faster on consumer devices.
Official Code Release for [SIGGRAPH 2024] DiLightNet: Fine-grained Lighting Control for Diffusion-based Image Generation
🔥ICLR 2025 (Spotlight) One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation Using a Single Prompt
FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis
VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation
[RSS 2024] Code for "Multimodal Diffusion Transformer: Learning Versatile Behavior from Multimodal Goals" for CALVIN experiments with pre-trained weights
Code for FreeScale, a tuning-free method for higher-resolution visual generation
The aim of this repository is to test and implement Flow-Matching-based models
Janky implementation of HiDiffusion for ComfyUI
Open implementation of UNIVERSE and UNIVERSE++ diffusion-based speech enhancement models.
Code for "Around the World in 80 Timesteps: A Generative Approach to Global Visual Geolocation"