Trending repositories for topic diffusion-models
HunyuanVideo: A Systematic Framework For Large Video Generation Model
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simp...
Code for FreeScale, a tuning-free method for higher-resolution visual generation
Colab Notebooks covering deep learning tools for biomolecular structure prediction and design
Official implementation of the paper: "FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models"
Official implementation of the paper “MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control”
Official Implementation for "InstantRestore: Single-Step Personalized Face Restoration with Shared-Image Attention"
Implementation of papers in 100 lines of code.
Identity-Preserving Text-to-Video Generation by Frequency Decomposition
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
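PyTorch Practical Tutorial (2nd edition): whether you are a complete beginner, applying PyTorch to CV, NLP, or LLM projects, or moving on to advanced engineering and production deployment, it is all covered here. With the help of this book, readers should be able to master PyTorch with ease and become excellent deep learning engineers.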
High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
OpenMMLab Multimodal Advanced, Generative, and Intelligent Creation Toolbox. Unlock the magic 🪄: Generative-AI (AIGC), easy-to-use APIs, awesome model zoo, diffusion models, for text-to-image generati...
[NeurIPS'24] Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy
Official repo for VGGHeads: 3D Multi Head Alignment with a Large-Scale Synthetic Dataset.
Code for FreeScale, a tuning-free method for higher-resolution visual generation
Official Implementation for "InstantRestore: Single-Step Personalized Face Restoration with Shared-Image Attention"
Official implementation of the paper: "FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models"
Colab Notebooks covering deep learning tools for biomolecular structure prediction and design
[NeurIPS'24] Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy
Official implementation of the paper “MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control”
A repository for organizing papers, codes and other resources related to Virtual Try-on Models
The code of our work "Golden Noise for Diffusion Models: A Learning Framework".
A diffusion model-based stereo depth estimation framework that can predict state-of-the-art depth and restore noisy depth maps for transparent and specular surfaces
Official repo for VGGHeads: 3D Multi Head Alignment with a Large-Scale Synthetic Dataset.
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Identity-Preserving Text-to-Video Generation by Frequency Decomposition
The collection of awesome papers on alignment of diffusion models.
📚 Collection of awesome generation acceleration resources.
This repository implements time series diffusion in the frequency domain.
[ECCV 2024] GaussCtrl: Multi-View Consistent Text-Driven 3D Gaussian Splatting Editing
Code for FreeScale, a tuning-free method for higher-resolution visual generation
HunyuanVideo: A Systematic Framework For Large Video Generation Model
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simp...
Colab Notebooks covering deep learning tools for biomolecular structure prediction and design
Official implementation of the paper: "FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models"
Implementation of papers in 100 lines of code.
Official Implementation for "InstantRestore: Single-Step Personalized Face Restoration with Shared-Image Attention"
Code for FreeScale, a tuning-free method for higher-resolution visual generation
Official implementation of the paper “MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control”
[ICLR24] Official implementation of the paper “MagicDrive: Street View Generation with Diverse 3D Geometry Control”
Identity-Preserving Text-to-Video Generation by Frequency Decomposition
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
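PyTorch Practical Tutorial (2nd edition): whether you are a complete beginner, applying PyTorch to CV, NLP, or LLM projects, or moving on to advanced engineering and production deployment, it is all covered here. With the help of this book, readers should be able to master PyTorch with ease and become excellent deep learning engineers.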
A collection of resources and papers on Diffusion Models
High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
Colab Notebooks covering deep learning tools for biomolecular structure prediction and design
Official Implementation for "InstantRestore: Single-Step Personalized Face Restoration with Shared-Image Attention"
Code for FreeScale, a tuning-free method for higher-resolution visual generation
Official implementation of the paper: "FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models"
[NeurIPS'24] Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy
Official code of "Imagine360: Immersive 360 Video Generation from Perspective Anchor"
[NeurIPS2024] DiffPhyCon uses generative models to control complex physical systems
Official implementation of the paper “MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control”
The code of our work "Golden Noise for Diffusion Models: A Learning Framework".
A diffusion model-based stereo depth estimation framework that can predict state-of-the-art depth and restore noisy depth maps for transparent and specular surfaces
HunyuanVideo: A Systematic Framework For Large Video Generation Model
A repository for organizing papers, codes and other resources related to Virtual Try-on Models
[ICRA 2024] Language-Conditioned Affordance-Pose Detection in 3D Point Clouds
Identity-Preserving Text-to-Video Generation by Frequency Decomposition
Implementation of papers in 100 lines of code.
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
XQ-GAN🚀: An Open-source Image Tokenization Framework for Autoregressive Generation
Official implementation of the paper “MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control”
Official implementation of the paper: "FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models"
The official implementation of the paper titled "StableV2V: Stablizing Shape Consistency in Video-to-Video Editing".
Baking Gaussian Splatting into Diffusion Denoiser for Fast and Scalable Single-stage Image-to-3D Generation
Official Implementation for "InstantRestore: Single-Step Personalized Face Restoration with Shared-Image Attention"
Official code of "Imagine360: Immersive 360 Video Generation from Perspective Anchor"
Code for FreeScale, a tuning-free method for higher-resolution visual generation
The code of our work "Golden Noise for Diffusion Models: A Learning Framework".
Official implementation of "Art-Free Generative Models: Art Creation Without Graphic Art Knowledge"
HunyuanVideo: A Systematic Framework For Large Video Generation Model
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simp...
Implementation of papers in 100 lines of code.
Identity-Preserving Text-to-Video Generation by Frequency Decomposition
Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
Official implementation of the paper “MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control”
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
Colab Notebooks covering deep learning tools for biomolecular structure prediction and design
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
A collection of resources and papers on Diffusion Models
Material for lectures on Diffusion models at IE university
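PyTorch Practical Tutorial (2nd edition): whether you are a complete beginner, applying PyTorch to CV, NLP, or LLM projects, or moving on to advanced engineering and production deployment, it is all covered here. With the help of this book, readers should be able to master PyTorch with ease and become excellent deep learning engineers.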
Official implementation of the paper: "FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models"
The official implementation of the paper titled "StableV2V: Stablizing Shape Consistency in Video-to-Video Editing".
High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
XQ-GAN🚀: An Open-source Image Tokenization Framework for Autoregressive Generation
Official code of "Imagine360: Immersive 360 Video Generation from Perspective Anchor"
Official Implementation for "InstantRestore: Single-Step Personalized Face Restoration with Shared-Image Attention"
Code for FreeScale, a tuning-free method for higher-resolution visual generation
Official implementation of the paper “MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control”
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Inference-only implementation of "One-Step Diffusion Distillation through Score Implicit Matching" [NIPS 2024]
Official implementation of the paper: "FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models"
[3DV 2025] GarmentDreamer: 3DGS Guided Garment Synthesis with Diverse Geometry and Texture Details
XQ-GAN🚀: An Open-source Image Tokenization Framework for Autoregressive Generation
A repository for organizing papers, codes and other resources related to Virtual Try-on Models
Implementation of papers in 100 lines of code.
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
ICML 2024, Official Implementation of "Cross-view Masked Diffusion Transformers for Person Image Synthesis."
[NeurIPS2024] DiffPhyCon uses generative models to control complex physical systems
[NeurIPS'24] Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy
📚 Collection of awesome generation acceleration resources.
A diffusion model-based stereo depth estimation framework that can predict state-of-the-art depth and restore noisy depth maps for transparent and specular surfaces
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simp...
SUPIR aims at developing Practical Algorithms for Photo-Realistic Image Restoration In the Wild. Our new online demo is also released at suppixel.ai.
Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
Lumina-T2X is a unified framework for Text to Any Modality Generation
High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model. NeurIPS 2024 Spotlight.
[ECCV 2024] The official implementation of paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion"
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
Official implementation of "MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling"
Collecting awesome papers of RAG for AIGC. We propose a taxonomy of RAG foundations, enhancements, and applications in paper "Retrieval-Augmented Generation for AI-Generated Content: A Survey".
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.
[AAAI 2025]👔IMAGDressing👔: Interactive Modular Apparel Generation for Virtual Dressing. It enables customizable human image generation with flexible garment, pose, and scene control, ensuring high f...
A reading list for large models safety, security, and privacy (including Awesome LLM Security, Safety, etc.).
CatVTON is a simple and efficient virtual try-on diffusion model with 1) Lightweight Network (899.06M parameters in total), 2) Parameter-Efficient Training (49.57M trainable parameters) and 3) Simplifi...
[CVPR 2024] PIA, your Personalized Image Animator. Animate your images by text prompt, combining with Dreambooth, achieving stunning videos.
[ECCV 2024] MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model.
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simp...
HunyuanVideo: A Systematic Framework For Large Video Generation Model
SUPIR aims at developing Practical Algorithms for Photo-Realistic Image Restoration In the Wild. Our new online demo is also released at suppixel.ai.
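PyTorch Practical Tutorial (2nd edition): whether you are a complete beginner, applying PyTorch to CV, NLP, or LLM projects, or moving on to advanced engineering and production deployment, it is all covered here. With the help of this book, readers should be able to master PyTorch with ease and become excellent deep learning engineers.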
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
Official repo for VGen: a holistic video generation ecosystem for video generation building on diffusion models
Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
A collection of resources and papers on Diffusion Models
Lumina-T2X is a unified framework for Text to Any Modality Generation
High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
A general fine-tuning kit geared toward diffusion models.
DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model. NeurIPS 2024 Spotlight.
[ECCV 2024] The official implementation of paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion"
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
Official implementation of "MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling"
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
Official implementation of Posterior-Mean Rectified Flow: Towards Minimum MSE Photo-Realistic Image Restoration
SUPIR aims at developing Practical Algorithms for Photo-Realistic Image Restoration In the Wild. Our new online demo is also released at suppixel.ai.
[ECCV 2024] Single Image to 3D Textured Mesh in 10 seconds with Convolutional Reconstruction Model.
A reading list for large models safety, security, and privacy (including Awesome LLM Security, Safety, etc.).
A curated list of 3D Vision papers relating to the Robotics domain in the era of large models (i.e., LLMs/VLMs), inspired by awesome-computer-vision, including papers, code, and related websites
Collecting awesome papers of RAG for AIGC. We propose a taxonomy of RAG foundations, enhancements, and applications in paper "Retrieval-Augmented Generation for AI-Generated Content: A Survey".
High-quality Text-to-Audio Generation with Efficient Diffusion Transformer
PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation (ECCV 2024)
[ECCV 2024] MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model.
Official code repository of CBGBench: Fill in the Blank of Protein-Molecule Complex Binding Graph
Official repo for VGGHeads: 3D Multi Head Alignment with a Large-Scale Synthetic Dataset.
Text to Image Latent Diffusion using a Transformer core
[CVPR 2024] Official code for "Text-Driven Image Editing via Learnable Regions"
ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation (TMLR 2024)
An open-source toolbox for fast sampling of diffusion models. Official implementations of our works published in ICML, NeurIPS, CVPR.
[arXiv'24] VistaDream: Sampling multiview consistent images for single-view scene reconstruction
Live2Diff: A Pipeline that processes Live video streams by a uni-directional video Diffusion model.
Official Code Release for [SIGGRAPH 2024] DiLightNet: Fine-grained Lighting Control for Diffusion-based Image Generation
High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance