22 results found

249
3.8k
other
34
🪩 Create Disco Diffusion artworks in one line
Created 2022-06-30
385 commits to main branch, last one about a year ago
233
3.0k
apache-2.0
46
Represent, send, store and search multimodal data
Created 2021-12-14
1,462 commits to main branch, last one about a month ago
A curated list of different papers and datasets in various areas of audio-visual processing
Created 2019-03-30
63 commits to master branch, last one 9 months ago
113
547
apache-2.0
10
PyTorch source code for "Stacked Cross Attention for Image-Text Matching" (ECCV 2018)
Created 2018-05-11
19 commits to master branch, last one about a year ago
112
454
apache-2.0
7
Analyze unstructured data with Towhee, for tasks such as reverse image search, reverse video search, audio classification, question answering systems, molecular search, etc.
Created 2022-04-11
401 commits to main branch, last one about a year ago
13
203
unknown
23
[CVPR 2023] Referring Image Matting
Created 2022-06-12
6 commits to master branch, last one about a year ago
5
194
unknown
3
The official implementation of Achieving Cross Modal Generalization with Multimodal Unified Representation (NeurIPS '23)
Created 2023-10-24
27 commits to master branch, last one about a month ago
25
178
unknown
2
Remote Sensing SAR-Optical Land-use Classification in PyTorch: high-resolution remote sensing semantic segmentation / land-cover segmentation / land-cover classification
Created 2022-06-01
86 commits to main branch, last one 6 months ago
10
121
unknown
4
[NAACL 2022] Mobile Text-to-Image search powered by multimodal semantic representation models (e.g., OpenAI's CLIP)
Created 2021-08-07
132 commits to main branch, last one about a year ago
Weakly Supervised 3D Object Detection from Point Clouds (VS3D), ACM MM 2020
Created 2020-07-28
10 commits to master branch, last one 4 years ago
BioT5 (EMNLP 2023) and BioT5+ (ACL 2024 Findings)
Created 2023-10-11
60 commits to main branch, last one 2 months ago
8
91
unknown
2
Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning, CVPR 2022
Created 2022-04-29
5 commits to master branch, last one about a year ago
DistillBEV: Boosting Multi-Camera 3D Object Detection with Cross-Modal Knowledge Distillation (ICCV 2023)
Created 2023-09-25
19 commits to main branch, last one 12 months ago
12
71
apache-2.0
4
Code for journal paper "Learning Dual Semantic Relations with Graph Attention for Image-Text Matching", TCSVT, 2020.
Created 2020-10-22
60 commits to main branch, last one 2 years ago
This repository provides a comprehensive collection of research papers focused on multimodal representation learning, all of which have been cited and discussed in the survey just accepted https://dl....
Created 2022-03-13
66 commits to main branch, last one about a year ago
1
64
apache-2.0
1
[CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory"
Created 2023-08-28
17 commits to main branch, last one about a month ago
10
60
unknown
2
Official PyTorch implementation of our CVPR 2022 paper: Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for Image Captioning
Created 2022-05-09
3 commits to main branch, last one 2 years ago
7
56
apache-2.0
6
Unleash the Potential of Image Branch for Cross-modal 3D Object Detection [NeurIPS2023]
Created 2023-01-20
13 commits to main branch, last one 5 months ago
8
49
mit
4
[Paper][AAAI 2023] DUET: Cross-modal Semantic Grounding for Contrastive Zero-shot Learning
Created 2022-11-27
48 commits to main branch, last one 9 months ago
[ECCV2022] Contrastive Vision-Language Pre-training with Limited Resources
Created 2022-01-18
10 commits to main branch, last one 2 years ago
Code, dataset and models for our CVPR 2022 publication "Text2Pos"
Created 2021-02-09
202 commits to master branch, last one 2 years ago