23 results found

248
3.8k
other
34
🪩 Create Disco Diffusion artworks in one line
Created 2022-06-30
385 commits to main branch, last one about a year ago
233
3.0k
apache-2.0
45
Represent, send, store and search multimodal data
Created 2021-12-14
1,467 commits to main branch, last one 24 days ago
A curated list of different papers and datasets in various areas of audio-visual processing
Created 2019-03-30
63 commits to master branch, last one about a year ago
115
559
apache-2.0
9
PyTorch source code for "Stacked Cross Attention for Image-Text Matching" (ECCV 2018)
Created 2018-05-11
19 commits to master branch, last one 2 years ago
118
484
apache-2.0
6
Analyze unstructured data with Towhee for tasks such as reverse image search, reverse video search, audio classification, question answering systems, and molecular search.
Created 2022-04-11
401 commits to main branch, last one about a year ago
6
215
unknown
3
The official implementation of Achieving Cross Modal Generalization with Multimodal Unified Representation (NeurIPS '23)
Created 2023-10-24
29 commits to master branch, last one 3 months ago
27
214
unknown
1
Remote Sensing SAR-Optical Land-use Classification in PyTorch: high-resolution remote sensing semantic segmentation / land-cover segmentation / land-cover classification
Created 2022-06-01
86 commits to main branch, last one 11 months ago
13
207
unknown
22
[CVPR 2023] Referring Image Matting
Created 2022-06-12
6 commits to master branch, last one about a year ago
10
124
unknown
4
[NAACL 2022] Mobile Text-to-Image search powered by multimodal semantic representation models (e.g., OpenAI's CLIP)
Created 2021-08-07
132 commits to main branch, last one about a year ago
BioT5 (EMNLP 2023) and BioT5+ (ACL 2024 Findings)
Created 2023-10-11
60 commits to main branch, last one 7 months ago
Weakly Supervised 3D Object Detection from Point Clouds (VS3D), ACM MM 2020
Created 2020-07-28
10 commits to master branch, last one 4 years ago
DistillBEV: Boosting Multi-Camera 3D Object Detection with Cross-Modal Knowledge Distillation (ICCV 2023)
Created 2023-09-25
19 commits to main branch, last one about a year ago
9
95
unknown
2
Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning, CVPR 2022
Created 2022-04-29
5 commits to master branch, last one 2 years ago
12
72
apache-2.0
3
Code for journal paper "Learning Dual Semantic Relations with Graph Attention for Image-Text Matching", TCSVT, 2020.
Created 2020-10-22
60 commits to main branch, last one 3 years ago
This repository provides a comprehensive collection of research papers focused on multimodal representation learning, all of which have been cited and discussed in the survey just accepted https://dl....
Created 2022-03-13
66 commits to main branch, last one about a year ago
1
67
apache-2.0
1
[CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory"
Created 2023-08-28
17 commits to main branch, last one 6 months ago
7
61
apache-2.0
5
Unleash the Potential of Image Branch for Cross-modal 3D Object Detection [NeurIPS2023]
Created 2023-01-20
13 commits to main branch, last one 10 months ago
10
60
unknown
2
Official PyTorch implementation of our CVPR 2022 paper: Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for Image Captioning
Created 2022-05-09
3 commits to main branch, last one 2 years ago
8
50
mit
3
[Paper][AAAI 2023] DUET: Cross-modal Semantic Grounding for Contrastive Zero-shot Learning
Created 2022-11-27
48 commits to main branch, last one about a year ago
[ECCV2022] Contrastive Vision-Language Pre-training with Limited Resources
Created 2022-01-18
10 commits to main branch, last one 2 years ago
Code, dataset and models for our CVPR 2022 publication "Text2Pos"
Created 2021-02-09
202 commits to master branch, last one 2 years ago
0
34
unknown
2
AlignCLIP: Improving Cross-Modal Alignment in CLIP (ICLR 2025)
Created 2024-04-09
11 commits to main branch, last one about a month ago