Search Results - RepositoryStats

2.1k

12.6k

other

220

🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP

bert onnx openai pytorch image2vec clip-model sentence2vec deep-learning neural-search cross-modality multi-modality bert-as-service clip-as-service sentence-encoding cross-modal-retrieval

Created 2018-11-12

1,960 commits to main branch, last one about a year ago

xmodaler YehLi

105

970

other

28

X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense r...

tden pretraining image-captioning video-captioning vision-and-language cross-modal-retrieval visual-question-answering

Created 2021-06-25

84 commits to master branch, last one 2 years ago

Awesome_Matching_Pretraining_Transfering Paranioar

48

422

mit

13

The Paper List of Large Multi-Modality Model (Perception, Generation, Unification), Parameter-Efficient Finetuning, Vision-Language Pretraining, Conventional Image-Text Matching for Preliminary Insigh...

tutorial awesome-list image-text-matching large-vision-models vision-and-language image-text-retrieval large-language-model video-text-retrieval cross-modal-retrieval large-language-models multimodal-pretraining video-text-recognition memory-efficient-tuning text-to-image-synthesis text-to-image-generation text-to-video-generation visual-semantic-embedding large-vision-language-models parameter-efficient-fine-tuning multimodal-large-language-models

Created 2020-12-22

130 commits to main branch, last one 3 months ago

tidy slavabarkov

28

408

gpl-3.0

7

Offline semantic Text-to-Image and Image-to-Image search on Android powered by quantized state-of-the-art vision-language pretrained CLIP model and ONNX Runtime inference engine

nlp clip onnx kotlin android image-search quantization deep-learning computer-vision image-retrieval semantic-search image-text-matching image-text-retrieval cross-modal-retrieval

Created 2023-02-24

43 commits to main branch, last one about a year ago

KG-MM-Survey zjukg

19

395

mit

8

Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey

awsome survey surveys paper-list awsome-list entity-linking knowledge-graph entity-alignment image-generation multi-modal-fusion image-classification multi-modal-learning cross-modal-retrieval large-language-models information-extraction visual-question-answering knowledge-graph-embeddings multi-modal-knowledge-graph

Created 2024-01-29

83 commits to main branch, last one 3 months ago

Image-Text-Embedding layumi

73

289

mit

11

TOMM2020 Dual-Path Convolutional Image-Text Embedding with Instance Loss 🐾 https://arxiv.org/abs/1711.05535

matlab matconvnet image-search cross-modality image-retrieval visual-semantic language-retrieval cross-modal-retrieval bidirectional-retrieval person-reidentification

Created 2017-11-17

147 commits to master branch, last one 2 months ago

SGRAF Paranioar

36

213

unknown

5

[AAAI2021] The code of “Similarity Reasoning and Filtration for Image-Text Matching”

aaai text-matching image-retrieval similarity-metric image-text-matching image-text-retrieval cross-modal-retrieval

Created 2020-12-16

45 commits to main branch, last one 11 months ago

vse_infty woodfrog

16

160

mit

3

Code for "Learning the Best Pooling Strategy for Visual Semantic Embedding", CVPR 2021 (Oral)

vse pytorch vision-language visual-semantic image-text-matching cross-modal-retrieval

Created 2021-01-10

72 commits to master branch, last one 2 years ago

pvse yalesong

23

134

mit

3

Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval (CVPR 2019)

mrw-dataset tgif-dataset mscoco-dataset metric-learning cross-modal-retrieval

Created 2019-06-11

84 commits to master branch, last one about a year ago

EMCL jpthu17

9

130

mit

2

[NeurIPS 2022 Spotlight] Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations

neurips video-retrieval video-captioning cross-modal-retrieval video-question-answering

Created 2022-09-23

33 commits to main branch, last one 11 months ago

pcme naver-ai

17

129

other

4

Official PyTorch implementation of "Probabilistic Cross-Modal Embedding" (CVPR 2021)

cvpr2021 cross-modal-retrieval probabilistic-embeddings probabilistic-machine-learning

Created 2021-06-15

8 commits to main branch, last one about a year ago

DiffusionRet jpthu17

7

128

apache-2.0

2

[ICCV 2023] DiffusionRet: Generative Text-Video Retrieval with Diffusion Model

iccv2023 video-retrieval diffusion-models cross-modal-retrieval

Created 2023-03-16

20 commits to main branch, last one 11 months ago

HBI jpthu17

5

112

apache-2.0

3

[CVPR 2023 Highlight & TPAMI] Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning

cvpr video-retrieval cross-modal-retrieval video-question-answering

Created 2023-02-28

35 commits to main branch, last one 2 months ago

muscall ilaria-manco

11

111

gpl-3.0

7

Official implementation of "Contrastive Audio-Language Learning for Music" (ISMIR 2022)

music-ai cross-modal-retrieval music-information-retrieval

Created 2022-08-16

7 commits to main branch, last one 3 months ago

BagFormer howard-hou

33

97

unknown

23

PyTorch code for BagFormer: Better Cross-Modal Retrieval via bag-wise interaction

vision-language image-text-retrieval cross-modal-retrieval

Created 2022-05-24

35 commits to main branch, last one 2 years ago

pcmepp naver-ai

1

57

other

3

Official PyTorch implementation of "Improved Probabilistic Image-Text Representations" (ICLR 2024)

iclr2024 cross-modal-retrieval probabilistic-embeddings probabilistic-machine-learning

Created 2023-05-29

8 commits to main branch, last one 10 months ago

on-the-fly-FGSBIR AyanKumarBhunia

16

57

unknown

3

[CVPR 2020, Oral] "Sketch Less for More: On-the-Fly Fine-Grained Sketch Based Image Retrieval", IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2020.

cvpr sbir sketch fg-sbir cvpr2020 cvpr-oral continuous-rl image-retrieval policy-gradient re-identification cross-modal-retrieval reinforcement-learning pytorch-policy-gradient continuous-reinforcement-learning

Created 2019-12-16

33 commits to master branch, last one 4 years ago

eccv-caption naver-ai

2

56

other

2

Extended COCO Validation (ECCV) Caption dataset (ECCV 2022)

dataset eccv2022 evaluation vl-benchmark deep-learning machine-learning image-text-matching vision-and-language cross-modal-retrieval

Created 2022-03-30

4 commits to main branch, last one about a year ago

UCCH penghu-cs

10

55

unknown

1

Unsupervised Contrastive Cross-modal Hashing (IEEE TPAMI 2023, PyTorch Code)

cross-modal-hashing contrastive-learning cross-modal-retrieval unsupervised-learning

Created 2022-05-20

55 commits to main branch, last one about a year ago

CM2_DVC ailab-kyunghee

2

53

mit

0

[CVPR 2024] Do you remember? Dense Video Captioning with Cross-Modal Memory Retrieval

dvc video memory retrieval video-cap multi-modal cross-modal-retrieval dense-video-captioning

Created 2023-09-06

23 commits to master branch, last one 9 months ago

MRL penghu-cs

9

53

mit

2

Learning Cross-Modal Retrieval with Noisy Labels (CVPR 2021, PyTorch Code)

noisy-labels cross-modal-retrieval multimodal-deep-learning

Created 2021-04-06

103 commits to main branch, last one 3 years ago

DiCoSA jpthu17

2

51

apache-2.0

2

[IJCAI 2023] Text-Video Retrieval with Disentangled Conceptualization and Set-to-Set Alignment

ijcai video-retrieval cross-modal-retrieval

Created 2023-04-29

15 commits to main branch, last one 11 months ago

TextReID BrandonHanx

5

44

unknown

1

[BMVC 2021] Text-Based Person Search with Limited Data

clip transfer-learning cross-modal-retrieval person-reidentification

Created 2021-02-15

14 commits to main branch, last one 2 years ago

Text2Pos-CVPR2022 mako443

6

43

unknown

3

Code, dataset and models for our CVPR 2022 publication "Text2Pos"

nlp cvpr pytorch cvpr2022 cross-modal localization deep-learning computer-vision language-processing cross-modal-learning cross-modal-retrieval

Created 2021-02-09

202 commits to master branch, last one 2 years ago

GNN4CMR LivXue

3

39

mit

1

PyTorch implementation of the AAAI-21 paper "Dual Adversarial Label-aware Graph Neural Networks for Cross-modal Retrieval" and the TPAMI-22 paper "Integrating Multi-Label Contrastive Learning with Dua...

pytorch adversarial-networks contrastive-learning cross-modal-retrieval graph-neural-networks

Created 2021-09-22

18 commits to main branch, last one 2 years ago

DGL knightyxp

1

39

other

1

[AAAI 2024] DGL: Dynamic Global-Local Prompt Tuning for Text-Video Retrieval.

prompt-tuning cross-modal-learning video-text-retrieval cross-modal-retrieval parameter-efficient-tuning video-language-understanding

Created 2024-02-14

28 commits to main branch, last one 5 months ago

RCAR Paranioar

3

33

apache-2.0

1

[TIP2023] The code of “Plug-and-Play Regulators for Image-Text Matching”

tip regulator text-matching image-retrieval image-text-matching image-text-retrieval cross-modal-retrieval

Created 2023-03-23

15 commits to main branch, last one 11 months ago

SPN4CIR BUAADreamer

3

30

mit

1

[ACM MM 2024] Improving Composed Image Retrieval via Contrastive Learning with Scaling Positives and Negatives

blip clip blip2 llama llava acmmm2024 memory-bank transformer data-generation image-retrieval multimodal-learning cross-modal-retrieval multi-modal-retrieval composed-image-retrieval

Created 2024-04-12

12 commits to master branch, last one 4 months ago

SEMScene MartinYuanNJU

1

25

unknown

1

Code implementation of the paper "SEMScene: Semantic-Consistency Enhanced Multi-Level Scene Graph Matching for Image-Text Retrieval".

scene-graph-models image-text-matching cross-modal-retrieval

Created 2023-11-26

51 commits to main branch, last one 4 months ago