9 results found

Unofficial implementation of "Prompt-to-Prompt Image Editing with Cross Attention Control" with Stable Diffusion
Created 2022-09-09
49 commits to main branch, last one about a year ago
55
921
apache-2.0
13
Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️
Created 2023-02-21
286 commits to main branch, last one about a month ago
[TPAMI'23] Unifying Flow, Stereo and Depth Estimation
Created 2022-11-04
38 commits to master branch, last one about a month ago
T-GATE: Temporally Gating Attention to Accelerate Diffusion Model for Free!
Created 2024-03-28
105 commits to main branch, last one 9 days ago
Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google DeepMind
Created 2024-01-09
67 commits to main branch, last one 3 months ago
1-shot image segmentation using Stable Diffusion
Created 2023-09-27
22 commits to main branch, last one 2 months ago
Code for selecting an action based on multimodal inputs. In this case, the inputs are voice and text.
Created 2021-05-19
14 commits to main branch, last one 2 years ago
The official repository of "Energy-Based Cross Attention for Bayesian Context Update in Text-to-Image Diffusion Models".
Created 2023-05-24
7 commits to main branch, last one 2 months ago