219 results found
BoxMOT: pluggable SOTA tracking modules for segmentation, object detection and pose estimation models
Created
2020-06-26
3,403 commits to master branch, last one a day ago
Effortless data labeling with AI support from Segment Anything and other awesome models.
Created
2023-05-23
817 commits to main branch, last one 5 hours ago
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
Created
2022-07-08
382 commits to master branch, last one 7 months ago
Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai
Created
2022-08-01
1,532 commits to mainline branch, last one 3 days ago
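Open-source push notification service that needs no app; on iOS 14+, just scan a QR code to use it. Also supports Quick Apps, iOS and Mac clients, Android clients, and self-built devices.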
Created
2021-12-16
310 commits to main branch, last one 17 days ago
OpenMMLab Pre-training Toolbox and Benchmark
Created
2020-07-09
974 commits to main branch, last one 5 months ago
Chinese NLP solutions (large models, data, models, training, inference)
Created
2023-02-05
249 commits to main branch, last one about a month ago
Image to prompt with BLIP and CLIP
Created
2022-08-09
98 commits to main branch, last one about a year ago
Collection of AWESOME vision-language models for vision tasks
Created
2023-03-30
91 commits to main branch, last one 7 days ago
Easily compute CLIP embeddings and build a CLIP retrieval system with them
Created
2021-06-07
332 commits to main branch, last one about a year ago
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
Created
2023-12-01
1,253 commits to main branch, last one a day ago
Rapid Android UI development, a cure for the many quirks of native widgets
Created
2018-04-26
186 commits to master branch, last one about a year ago
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
Created
2024-06-17
59 commits to main branch, last one 5 months ago
Must-have resource for anyone who wants to experiment with and build on the OpenAI vision API 🔥
Created
2023-11-07
52 commits to main branch, last one 4 months ago
🥂 Gracefully face hCaptcha challenges with a multimodal large language model.
Created
2022-02-15
886 commits to main branch, last one 15 hours ago
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for ...
Created
2023-05-18
44 commits to main branch, last one a day ago
Awesome list for research on CLIP (Contrastive Language-Image Pre-Training).
Created
2021-09-05
56 commits to main branch, last one 2 years ago
Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️
Created
2023-02-21
298 commits to main branch, last one 2 months ago
This series will take you on a journey from the fundamentals of NLP and Computer Vision to the cutting edge of Vision-Language Models.
Created
2024-12-20
6 commits to master branch, last one 2 months ago
Stable Diffusion in NCNN with C++, supporting txt2img and img2img
Created
2022-11-11
60 commits to main branch, last one about a year ago
Search photos on Unsplash using natural language
Created
2021-01-16
65 commits to main branch, last one 2 years ago
An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
Created
2021-04-13
29 commits to master branch, last one 2 years ago
Search inside YouTube videos using natural language
Created
2021-02-01
20 commits to main branch, last one 3 years ago
Official PyTorch Implementation for "Text2LIVE: Text-Driven Layered Image and Video Editing" (ECCV 2022 Oral)
Created
2022-07-26
9 commits to main branch, last one 2 years ago
[ICCV 2021 Oral] Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-bas...
Created
2021-03-23
77 commits to main branch, last one 2 years ago
CLIP + FFT/DWT/RGB = text to image/video
Created
2021-02-28
180 commits to master branch, last one about a month ago
Famous Vision Language Models and Their Architectures
Created
2024-02-15
240 commits to main branch, last one about a month ago
[CVPR'23] OpenScene: 3D Scene Understanding with Open Vocabularies
Created
2023-03-18
15 commits to main branch, last one about a year ago
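An AI painting model optimized from Stable Diffusion. It accepts Chinese and English text input and can generate high-quality images in a variety of modern art styles. | An optimized text-to-image model based on Stable Diffusion. Both Chinese and English text inputs are available to generate images. The model c...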
Created
2022-12-13
36 commits to main branch, last one 2 years ago
Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm
Created
2021-10-09
34 commits to main branch, last one 2 years ago