7 results found

Official implementation of paper "MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens"
Created 2023-10-02
21 commits to main branch, last one 4 months ago
🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio).
Created 2023-11-17
357 commits to main branch, last one 7 days ago
Official implementation of the ICASSP-2022 paper "Text2Poster: Laying Out Stylized Texts on Retrieved Images"
Created 2022-09-18
61 commits to master branch, last one about a year ago
[ICLR 2024] Contextualized Diffusion Models for Text-Guided Image and Video Generation
Created 2024-02-06
18 commits to main branch, last one 10 months ago
HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation
Created 2025-02-11
43 commits to main branch, last one about a month ago
A curated list of vision-to-music generation: methods, datasets, evaluation and challenges.
Created 2025-03-24
11 commits to main branch, last one 15 days ago
[CVPR '23] Unite and Conquer: Plug & Play Multi-Modal Synthesis using Diffusion Models
Created 2023-04-20
47 commits to main branch, last one about a year ago