4 results found

63
1.1k
apache-2.0
15
Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️
Created 2023-02-21
293 commits to main branch, last one 2 months ago
Reproducible scaling laws for contrastive language-image learning (https://arxiv.org/abs/2212.07143)
Created 2022-12-13
54 commits to master branch, last one about a year ago
Using Segment-Anything and CLIP to generate pixel-aligned semantic features.
Created 2023-04-20
2 commits to main branch, last one about a year ago
[Official] [IROS 2024] A goal-oriented planning to lift VLN performance for Closed-Loop Navigation: Simple, Yet Effective
Created 2024-03-22
7 commits to main branch, last one 8 months ago