23 results found

246
3.8k
other
34
🪩 Create Disco Diffusion artworks in one line
Created 2022-06-30
385 commits to main branch, last one about a year ago
222
2.8k
apache-2.0
44
Represent, send, store and search multimodal data
Created 2021-12-14
1,448 commits to main branch, last one about a month ago
Effective prompting for Large Multimodal Models like GPT-4 Vision, LLaVA or CogVLM. 🔥
Created 2023-11-24
71 commits to develop branch, last one 4 months ago
A curated list of different papers and datasets in various areas of audio-visual processing
Created 2019-03-30
63 commits to master branch, last one 4 months ago
111
528
apache-2.0
10
PyTorch source code for "Stacked Cross Attention for Image-Text Matching" (ECCV 2018)
Created 2018-05-11
19 commits to master branch, last one about a year ago
104
390
apache-2.0
8
Analyze unstructured data with Towhee for tasks such as reverse image search, reverse video search, audio classification, question answering systems, and molecular search.
Created 2022-04-11
401 commits to main branch, last one 6 months ago
12
199
unknown
23
[CVPR 2023] Referring Image Matting
Created 2022-06-12
6 commits to master branch, last one about a year ago
20
151
unknown
2
Remote Sensing SAR-Optical Land-use Classification in PyTorch. High-resolution remote sensing semantic segmentation / land-cover segmentation / land-cover classification in PyTorch
Created 2022-06-01
86 commits to main branch, last one 25 days ago
10
116
unknown
4
[NAACL 2022] Mobile Text-to-Image search powered by multimodal semantic representation models (e.g., OpenAI's CLIP)
Created 2021-08-07
132 commits to main branch, last one about a year ago
Weakly Supervised 3D Object Detection from Point Clouds (VS3D), ACM MM 2020
Created 2020-07-28
10 commits to master branch, last one 3 years ago
5
86
unknown
2
Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning, CVPR 2022
Created 2022-04-29
5 commits to master branch, last one about a year ago
DistillBEV: Boosting Multi-Camera 3D Object Detection with Cross-Modal Knowledge Distillation (ICCV 2023)
Created 2023-09-25
19 commits to main branch, last one 6 months ago
BioT5: Enriching Cross-modal Integration in Biology with Chemical Knowledge and Natural Language Associations (EMNLP 2023)
Created 2023-10-11
42 commits to main branch, last one 4 days ago
1
70
unknown
3
The official implementation of Achieving Cross Modal Generalization with Multimodal Unified Representation (NeurIPS '23)
Created 2023-10-24
20 commits to master branch, last one 23 days ago
12
68
apache-2.0
4
Code for journal paper "Learning Dual Semantic Relations with Graph Attention for Image-Text Matching", TCSVT, 2020.
Created 2020-10-22
60 commits to main branch, last one 2 years ago
This repository provides a comprehensive collection of research papers focused on multimodal representation learning, all of which have been cited and discussed in the survey just accepted https://dl....
Created 2022-03-13
66 commits to main branch, last one 7 months ago
10
60
unknown
2
Official PyTorch implementation of our CVPR 2022 paper: Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for Image Captioning
Created 2022-05-09
3 commits to main branch, last one about a year ago
0
52
apache-2.0
1
[CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory"
Created 2023-08-28
13 commits to main branch, last one about a month ago
7
51
apache-2.0
6
Unleash the Potential of Image Branch for Cross-modal 3D Object Detection [NeurIPS2023]
Created 2023-01-20
12 commits to main branch, last one 4 months ago
[ECCV2022] Contrastive Vision-Language Pre-training with Limited Resources
Created 2022-01-18
10 commits to main branch, last one about a year ago
8
41
mit
4
[Paper][AAAI 2023] DUET: Cross-modal Semantic Grounding for Contrastive Zero-shot Learning
Created 2022-11-27
48 commits to main branch, last one 3 months ago
Code, dataset and models for our CVPR 2022 publication "Text2Pos"
Created 2021-02-09
202 commits to master branch, last one about a year ago