Trending repositories for topic diffusion-models
HunyuanVideo: A Systematic Framework For Large Video Generation Model
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simp...
Implementation of papers in 100 lines of code.
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
Official implementation of the paper “MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control”
Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
A collection of resources and papers on Diffusion Models
Identity-Preserving Text-to-Video Generation by Frequency Decomposition
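PyTorch Practical Tutorial (2nd Edition): whether you are starting from zero, applying PyTorch to CV, NLP, or LLM projects, or moving on to advanced engineering and deployment, it is all covered here. With this book's help, readers should be able to master PyTorch with ease and become excellent deep learning engineers.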
DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model. NeurIPS 2024 Spotlight.
SUPIR aims at developing Practical Algorithms for Photo-Realistic Image Restoration In the Wild. Our new online demo is also released at suppixel.ai.
Diffusion Models in Medical Imaging (Published in Medical Image Analysis Journal)
High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
Implementation of DiffDock: Diffusion Steps, Twists, and Turns for Molecular Docking
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
[ECCV 2024] The official implementation of paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion"
Official implementation of the paper “MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control”
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Implementation of papers in 100 lines of code.
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
A repository for organizing papers, codes and other resources related to Virtual Try-on Models
A diffusion model-based stereo depth estimation framework that can predict state-of-the-art depth and restore noisy depth maps for transparent and specular surfaces
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simp...
[ICRA 2024] Language-Conditioned Affordance-Pose Detection in 3D Point Clouds
This repo implements Denoising Diffusion Probabilistic Models (DDPM) in PyTorch
[ECCV 2024] GaussCtrl: Multi-View Consistent Text-Driven 3D Gaussian Splatting Editing
Identity-Preserving Text-to-Video Generation by Frequency Decomposition
Official implementation for the paper "Model-based Diffusion for Trajectory Optimization". Model-based diffusion (MBD) is a novel diffusion-based trajectory optimization framework that employs a dynam...
Baking Gaussian Splatting into Diffusion Denoiser for Fast and Scalable Single-stage Image-to-3D Generation
The official implementation of the paper titled "StableV2V: Stablizing Shape Consistency in Video-to-Video Editing".
[NeurIPS 2023] A Dynamics-informed Diffusion Model for Spatiotemporal Forecasting
HunyuanVideo: A Systematic Framework For Large Video Generation Model
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simp...
Implementation of papers in 100 lines of code.
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
Identity-Preserving Text-to-Video Generation by Frequency Decomposition
Official implementation of the paper “MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control”
Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
XQ-GAN🚀: An Open-source Image Tokenization Framework for Autoregressive Generation
A collection of resources and papers on Diffusion Models
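PyTorch Practical Tutorial (2nd Edition): whether you are starting from zero, applying PyTorch to CV, NLP, or LLM projects, or moving on to advanced engineering and deployment, it is all covered here. With this book's help, readers should be able to master PyTorch with ease and become excellent deep learning engineers.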
Diffusion Models in Medical Imaging (Published in Medical Image Analysis Journal)
Official code of "Imagine360: Immersive 360 Video Generation from Perspective Anchor"
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
Collection of diffusion model papers categorized by their subareas
High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
[ICLR24] Official implementation of the paper “MagicDrive: Street View Generation with Diverse 3D Geometry Control”
Official code of "Imagine360: Immersive 360 Video Generation from Perspective Anchor"
Implementation of papers in 100 lines of code.
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Official implementation of the paper “MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control”
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
A diffusion model-based stereo depth estimation framework that can predict state-of-the-art depth and restore noisy depth maps for transparent and specular surfaces
XQ-GAN🚀: An Open-source Image Tokenization Framework for Autoregressive Generation
Official implementation of "Art-Free Generative Models: Art Creation Without Graphic Art Knowledge"
Identity-Preserving Text-to-Video Generation by Frequency Decomposition
📚 Collection of awesome generation acceleration resources.
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simp...
A repository for organizing papers, codes and other resources related to Virtual Try-on Models
Inference-only implementation of "One-Step Diffusion Distillation through Score Implicit Matching" [NeurIPS 2024]
[ECCV 2024] RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models
This repo implements Denoising Diffusion Probabilistic Models (DDPM) in PyTorch
The official implementation of the paper titled "StableV2V: Stablizing Shape Consistency in Video-to-Video Editing".
Treeffuser is an easy-to-use package for probabilistic prediction and probabilistic regression on tabular data with tree-based diffusion models.
Identity-Preserving Text-to-Video Generation by Frequency Decomposition
Official implementation of the paper “MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control”
The official implementation of the paper titled "StableV2V: Stablizing Shape Consistency in Video-to-Video Editing".
Baking Gaussian Splatting into Diffusion Denoiser for Fast and Scalable Single-stage Image-to-3D Generation
Official code of "Imagine360: Immersive 360 Video Generation from Perspective Anchor"
Official implementation of "Art-Free Generative Models: Art Creation Without Graphic Art Knowledge"
HunyuanVideo: A Systematic Framework For Large Video Generation Model
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simp...
Implementation of papers in 100 lines of code.
Identity-Preserving Text-to-Video Generation by Frequency Decomposition
Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
Official implementation of the paper “MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control”
A collection of resources and papers on Diffusion Models
Material for lectures on Diffusion models at IE university
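PyTorch Practical Tutorial (2nd Edition): whether you are starting from zero, applying PyTorch to CV, NLP, or LLM projects, or moving on to advanced engineering and deployment, it is all covered here. With this book's help, readers should be able to master PyTorch with ease and become excellent deep learning engineers.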
High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
The official implementation of the paper titled "StableV2V: Stablizing Shape Consistency in Video-to-Video Editing".
CatVTON is a simple and efficient virtual try-on diffusion model with 1) Lightweight Network (899.06M parameters in total), 2) Parameter-Efficient Training (49.57M trainable parameters) and 3) Simplifi...
DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model. NeurIPS 2024 Spotlight.
Baking Gaussian Splatting into Diffusion Denoiser for Fast and Scalable Single-stage Image-to-3D Generation
Official code of "Imagine360: Immersive 360 Video Generation from Perspective Anchor"
Official implementation of the paper “MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control”
Inference-only implementation of "One-Step Diffusion Distillation through Score Implicit Matching" [NeurIPS 2024]
HunyuanVideo: A Systematic Framework For Large Video Generation Model
XQ-GAN🚀: An Open-source Image Tokenization Framework for Autoregressive Generation
Repository for the paper "Combining audio control and style transfer using latent diffusion", accepted at ISMIR 2024
A repository for organizing papers, codes and other resources related to Virtual Try-on Models
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
Implementation of papers in 100 lines of code.
A diffusion model-based stereo depth estimation framework that can predict state-of-the-art depth and restore noisy depth maps for transparent and specular surfaces
ICML 2024, Official Implementation of "Cross-view Masked Diffusion Transformers for Person Image Synthesis."
📚 Collection of awesome generation acceleration resources.
PyTorch implementation of "SMITE: Segment Me In TimE"
"Robust Watermarking Using Generative Priors Against Image Editing: From Benchmarking to Advances" (Official Implementation)
Treeffuser is an easy-to-use package for probabilistic prediction and probabilistic regression on tabular data with tree-based diffusion models.
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simp...
The official implementation of DiffAbXL benchmarked in the paper "Exploring Log-Likelihood Scores for Ranking Antibody Sequence Designs", formerly titled "Benchmarking Generative Models for Antibody D...
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simp...
SUPIR aims at developing Practical Algorithms for Photo-Realistic Image Restoration In the Wild. Our new online demo is also released at suppixel.ai.
Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
Lumina-T2X is a unified framework for Text to Any Modality Generation
High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model. NeurIPS 2024 Spotlight.
[ECCV 2024] The official implementation of paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion"
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
Official implementation of "MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling"
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
Collecting awesome papers of RAG for AIGC. We propose a taxonomy of RAG foundations, enhancements, and applications in paper "Retrieval-Augmented Generation for AI-Generated Content: A Survey".
Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.
[AAAI 2025]👔IMAGDressing👔: Interactive Modular Apparel Generation for Virtual Dressing. It enables customizable human image generation with flexible garment, pose, and scene control, ensuring high f...
A reading list for large models safety, security, and privacy (including Awesome LLM Security, Safety, etc.).
CatVTON is a simple and efficient virtual try-on diffusion model with 1) Lightweight Network (899.06M parameters in total), 2) Parameter-Efficient Training (49.57M trainable parameters) and 3) Simplifi...
[CVPR 2024] PIA, your Personalized Image Animator. Animate your images with a text prompt, combined with Dreambooth, to produce stunning videos.
[ECCV 2024] MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model.
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simp...
SUPIR aims at developing Practical Algorithms for Photo-Realistic Image Restoration In the Wild. Our new online demo is also released at suppixel.ai.
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Official repo for VGen: a holistic video generation ecosystem for video generation building on diffusion models
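PyTorch Practical Tutorial (2nd Edition): whether you are starting from zero, applying PyTorch to CV, NLP, or LLM projects, or moving on to advanced engineering and deployment, it is all covered here. With this book's help, readers should be able to master PyTorch with ease and become excellent deep learning engineers.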
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
A collection of resources and papers on Diffusion Models
Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
Lumina-T2X is a unified framework for Text to Any Modality Generation
High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
A general fine-tuning kit geared toward diffusion models.
DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model. NeurIPS 2024 Spotlight.
[ECCV 2024] The official implementation of paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion"
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
Official implementation of "MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling"
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
Official implementation of Posterior-Mean Rectified Flow: Towards Minimum MSE Photo-Realistic Image Restoration
SUPIR aims at developing Practical Algorithms for Photo-Realistic Image Restoration In the Wild. Our new online demo is also released at suppixel.ai.
[ECCV 2024] Single Image to 3D Textured Mesh in 10 seconds with Convolutional Reconstruction Model.
A curated list of 3D vision papers related to the robotics domain in the era of large models (i.e., LLMs/VLMs), inspired by awesome-computer-vision, including papers, code, and related websites
A reading list for large models safety, security, and privacy (including Awesome LLM Security, Safety, etc.).
Collecting awesome papers of RAG for AIGC. We propose a taxonomy of RAG foundations, enhancements, and applications in paper "Retrieval-Augmented Generation for AI-Generated Content: A Survey".
High-quality Text-to-Audio Generation with Efficient Diffusion Transformer
PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation (ECCV 2024)
[ECCV 2024] MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model.
Official code repository of CBGBench: Fill in the Blank of Protein-Molecule Complex Binding Graph
Text to Image Latent Diffusion using a Transformer core
Official repo for VGGHeads: 3D Multi Head Alignment with a Large-Scale Synthetic Dataset.
[CVPR 2024] Official code for "Text-Driven Image Editing via Learnable Regions"
ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation (TMLR 2024)
🔥 [CVPR 2024] Official implementation of "Self-correcting LLM-controlled Diffusion Models" (SLD)
An open-source toolbox for fast sampling of diffusion models. Official implementations of our works published in ICML, NeurIPS, CVPR.
[arXiv'24] VistaDream: Sampling multiview consistent images for single-view scene reconstruction
Live2Diff: A pipeline that processes live video streams with a uni-directional video diffusion model.
Official Code Release for [SIGGRAPH 2024] DiLightNet: Fine-grained Lighting Control for Diffusion-based Image Generation