19 results found
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model with performance approaching GPT-4o.
Created
2023-11-22
216 commits to main branch, last one a day ago
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Created
2022-01-25
64 commits to main branch, last one 2 years ago
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
Created
2022-07-08
382 commits to master branch, last one 3 months ago
A paper list covering large multi-modality models, parameter-efficient fine-tuning, vision-language pretraining, and conventional image-text matching, intended as a preliminary overview.
Topics: tutorial, awesome-list, image-text-matching, large-vision-models, vision-and-language, image-text-retrieval, video-text-retrieval, cross-modal-retrieval, large-language-models, multimodal-pretraining, video-text-recognition, memory-efficient-tuning, visual-semantic-embedding, large-vision-language-models, parameter-efficient-fine-tuning, multimodal-large-language-models
Created
2020-12-22
129 commits to main branch, last one 4 months ago
🔍 Search local images with natural language on Android, powered by OpenAI's CLIP model.
Created
2023-08-10
110 commits to master branch, last one 28 days ago
Offline semantic text-to-image and image-to-image search on Android, powered by a quantized state-of-the-art pretrained vision-language CLIP model and the ONNX Runtime inference engine
Created
2023-02-24
43 commits to main branch, last one about a year ago
[AAAI2021] The code of “Similarity Reasoning and Filtration for Image-Text Matching”
Created
2020-12-16
45 commits to main branch, last one 7 months ago
Official implementation of the ICASSP-2022 paper "Text2Poster: Laying Out Stylized Texts on Retrieved Images"
Created
2022-09-18
61 commits to master branch, last one 11 months ago
Research Code for Multimodal-Cognition Team in Ant Group
Created
2023-08-21
142 commits to main branch, last one 4 months ago
PyTorch code for BagFormer: Better Cross-Modal Retrieval via bag-wise interaction
Created
2022-05-24
35 commits to main branch, last one about a year ago
[ICML 2023] UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers.
Created
2023-05-27
146 commits to main branch, last one about a year ago
mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections. (EMNLP 2022)
Created
2023-05-08
4 commits to main branch, last one about a year ago
ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration
Created
2021-08-02
59 commits to main branch, last one about a year ago
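Deploys Chinese CLIP with OpenCV and onnxruntime for text-to-image search: given a sentence describing the desired image, it retrieves matching images from a gallery. Includes both C++ and Python versions of the program.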
Created
2023-12-24
17 commits to main branch, last one 10 months ago
Image captioning using Python and BLIP
Created
2023-01-13
32 commits to master branch, last one about a year ago
Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching"
Created
2023-11-10
13 commits to main branch, last one 7 months ago
Official implementation of our EMNLP 2022 paper "CPL: Counterfactual Prompt Learning for Vision and Language Models"
Created
2022-10-26
16 commits to master branch, last one about a year ago
[TIP2023] The code of “Plug-and-Play Regulators for Image-Text Matching”
Created
2023-03-23
15 commits to main branch, last one 7 months ago
[ICML 2024] CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers.
Created
2023-05-27
11 commits to main branch, last one about a year ago