10 results found

974
9.9k
bsd-3-clause
98
LAVIS - A One-stop Library for Language-Vision Intelligence
Created 2022-08-24
492 commits to main branch, last one 2 days ago
688
6.8k
apache-2.0
42
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
Created 2023-03-09
84 commits to main branch, last one 3 months ago
642
4.8k
bsd-3-clause
34
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Created 2022-01-25
64 commits to main branch, last one 2 years ago
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
Created 2022-09-28
62 commits to main branch, last one about a month ago
[CVPR2023 Highlight] GRES: Generalized Referring Expression Segmentation
Created 2023-03-11
14 commits to main branch, last one about a year ago
29
489
apache-2.0
8
[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception
Created 2023-08-25
57 commits to main branch, last one 6 months ago
[ICCV2021 & TPAMI2023] Vision-Language Transformer and Query Generation for Referring Segmentation
Created 2021-07-23
7 commits to main branch, last one 3 years ago
5
99
bsd-3-clause
5
[ICML 2023] UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers.
Created 2023-05-27
146 commits to main branch, last one about a year ago
Instruction Following Agents with Multimodal Transformers
Created 2022-10-23
5 commits to main branch, last one 2 years ago
0
26
unknown
6
[ICML 2024] CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers.
Created 2023-05-27
11 commits to main branch, last one about a year ago