Search Results - RepositoryStats

boxmot mikel-brostrom

1.8k

7.1k

agpl-3.0

61

BoxMOT: pluggable SOTA tracking modules for segmentation, object detection and pose estimation models

Created 2020-06-26

3,403 commits to master branch, last one a day ago

X-AnyLabeling CVHub520

582

5.2k

gpl-3.0

34

Effortless data labeling with AI support from Segment Anything and other awesome models.

Created 2023-05-23

817 commits to main branch, last one 5 hours ago

Chinese-CLIP OFA-Sys

491

5.0k

mit

36

Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.

nlp clip chinese pytorch multi-modal transformers coreml-models deep-learning computer-vision vision-language contrastive-loss pretrained-models image-text-retrieval multi-modal-learning vision-and-language-pre-training

Created 2022-07-08

382 commits to master branch, last one 7 months ago

marqo marqo-ai

201

4.8k

apache-2.0

39

Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai

Created 2022-08-01

1,532 commits to mainline branch, last one 3 days ago

pushdeer easychen

489

4.8k

other

42

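Open-source push service that needs no app: on iOS 14+, scan a code and it is ready to use. Also supports Quick Apps, iOS and Mac clients, Android clients, and self-built devices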

app clip push notification-service

Created 2021-12-16

310 commits to main branch, last one 17 days ago

mmpretrain open-mmlab

1.1k

3.6k

apache-2.0

29

OpenMMLab Pre-training Toolbox and Benchmark

mae beit clip moco resnet pytorch convnext mobilenet multimodal deep-learning swin-transformer pretrained-models vision-transformer image-classification constrastive-learning masked-image-modeling self-supervised-learning

Created 2020-07-09

974 commits to main branch, last one 5 months ago

zero_nlp yuanzhoulvpi2017

396

3.3k

mit

31

Chinese NLP solutions (large models, data, models, training, inference)

gpt nlp bert clip gpt2 llama llava llama2 pytorch chatglm-6b transformers text-generation huggingface-transformers

Created 2023-02-05

249 commits to main branch, last one about a month ago

clip-interrogator pharmapsychotic

434

2.8k

mit

31

Image to prompt with BLIP and CLIP

clip pytorch

Created 2022-08-09

98 commits to main branch, last one about a year ago

VLM_survey jingyi0000

202

2.6k

unknown

99

Collection of AWESOME vision-language models for vision tasks

clip survey deep-learning computer-vision multi-modal-model transfer-learning vision-language-model knowledge-distillation

Created 2023-03-30

91 commits to main branch, last one 7 days ago

clip-retrieval rom1504

222

2.5k

mit

25

Easily compute clip embeddings and build a clip retrieval system with them

ai knn clip multimodal deep-learning semantic-search

Created 2021-06-07

332 commits to main branch, last one about a year ago

VLMEvalKit open-compass

305

2.1k

apache-2.0

12

Open-source evaluation toolkit of large multi-modality models (LMMs), support 220+ LMMs, 80+ benchmarks

gpt llm vit vqa clip gpt4 qwen llava claude gemini gpt-4v openai chatgpt pytorch evaluation openai-api multi-modal computer-vision large-language-models

Created 2023-12-01

1,253 commits to main branch, last one a day ago

RWidgetHelper RuffianZhong

176

1.9k

unknown

30

Rapid Android UI development, built to tame the many quirks of native widgets

Created 2018-04-26

186 commits to master branch, last one about a year ago

cambrian cambrian-mllm

129

1.9k

apache-2.0

23

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

clip dino llms mllm chatbot computer-vision instruction-tuning large-language-models representation-learning multimodal-large-language-models

Created 2024-06-17

59 commits to main branch, last one 5 months ago

awesome-openai-vision-api-experiments roboflow

134

1.7k

unknown

26

Must-have resource for anyone who wants to experiment with and build on the OpenAI vision API 🔥

clip openai chatgpt zero-shot classification grounding-dino computer-vision segment-anything open-vocabulary-detection open-vocabulary-segmentation

Created 2023-11-07

52 commits to main branch, last one 4 months ago

hcaptcha-challenger QIN2DIM

258

1.5k

gpl-3.0

22

🥂 Gracefully face hCaptcha challenge with multimodal large language model.

clip onnx yolo solver yolov5 hcaptcha playwright multi-modal onnx-models onnxruntime opencv-python computer-vision hcaptcha-solver object-detection image-segmentation multi-modal-learning zero-shot-classification

Created 2022-02-15

886 commits to main branch, last one 15 hours ago

Video-ChatGPT mbzuai-oryx

111

1.3k

cc-by-4.0

14

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for ...

clip gpt-4 llama llava vicuna chatbot mulit-modal video-chatboat vision-language video-conversation vision-language-pretraining

Created 2023-05-18

44 commits to main branch, last one a day ago

Awesome-CLIP yzhuoning

57

1.2k

unknown

19

Awesome list for research on CLIP (Contrastive Language-Image Pre-Training).

clip pre-training contrastive-learning

Created 2021-09-05

56 commits to main branch, last one 2 years ago

uform unum-cloud

63

1.1k

apache-2.0

15

Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️

Created 2023-02-21

298 commits to main branch, last one 2 months ago

vlms-zero-to-hero SkalskiP

97

1.1k

apache-2.0

44

This series will take you on a journey from the fundamentals of NLP and Computer Vision to the cutting edge of Vision-Language Models.

gpt clip lora gpt-2 seq2seq word2vec bert-model embeddings computer-vision vision-language-model natural-language-processing

Created 2024-12-20

6 commits to master branch, last one 2 months ago

Stable-Diffusion-NCNN EdVince

99

1.0k

bsd-3-clause

25

Stable Diffusion in NCNN with C++, supporting txt2img and img2img

cpp mnn tnn clip ncnn onnx android img2img txt2img tensorrt diffusion executable stable-diffusion

Created 2022-11-11

60 commits to main branch, last one about a year ago

natural-language-image-search haltakov

103

1.0k

mit

11

Search photos on Unsplash using natural language

clip photos unsplash image-search computer-vision machine-learning

Created 2021-01-16

65 commits to main branch, last one 2 years ago

CLIP4Clip ArrowLuo

126

928

mit

12

An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"

clip msvd lsmdc didemo msrvtt search ranking retrieval multimodal activitynet multimodality retrieval-model multimodal-learning video-clip-retrieval video-text-retrieval

Created 2021-04-13

29 commits to master branch, last one 2 years ago

natural-language-youtube-search haltakov

71

925

mit

14

Search inside YouTube videos using natural language

clip search youtube computer-vision machine-learning

Created 2021-02-01

20 commits to main branch, last one 3 years ago

Text2LIVE omerbt

79

888

mit

28

Official Pytorch Implementation for "Text2LIVE: Text-Driven Layered Image and Video Editing" (ECCV 2022 Oral)

clip eccv2022 text2live single-image single-video image-editing video-editing generative-model image-manipulation text-driven-editing

Created 2022-07-26

9 commits to main branch, last one 2 years ago

Transformer-MM-Explainability hila-chefer

110

840

mit

8

[ICCV 2021- Oral] Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-bas...

vqa clip detr lxmert visualbert transformer transformers visualization explainability explainable-ai interpretability

Created 2021-03-23

77 commits to main branch, last one 2 years ago

aphantasia eps696

103

786

mit

22

CLIP + FFT/DWT/RGB = text to image/video

clip text-to-image text-to-video

Created 2021-02-28

180 commits to master branch, last one about a month ago

awesome-vlm-architectures gokayfem

38

751

cc0-1.0

15

Famous Vision Language Models and Their Architectures

vlm blip clip llava cogvlm kosmos awesome qwen-vl internlm multimodal awesome-list text-encoder image-encoder vision-language-model

Created 2024-02-15

240 commits to main branch, last one about a month ago

openscene pengsongyou

55

705

apache-2.0

18

[CVPR'23] OpenScene: 3D Scene Understanding with Open Vocabularies

llm clip scannet cvpr2023 nuscenes matterport3d point-clouds semantic-segmentation 3d-scene-understanding point-cloud-segmentation

Created 2023-03-18

15 commits to main branch, last one about a year ago

SkyPaint-AI-Diffusion SkyWorkAIGC

37

651

mit

11

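An AI painting model optimized from Stable Diffusion. It accepts Chinese and English text input and can generate high-quality images in a variety of modern art styles. | An optimized text-to-image model based on Stable Diffusion. Both Chinese and English text inputs are available to generate images. The model c...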

cv aigc bert clip dalle2 openai pytorch diffusion dreambooth midjourney text2image ai-painting text-to-image generative-art latent-diffusion machine-learning stable-diffusion artificial-intelligence

Created 2022-12-13

36 commits to main branch, last one 2 years ago

DeCLIP Sense-GVT

32

649

unknown

21

Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm

clip big-model zero-shot image-text multi-model self-supervised vision-language-pretraining

Created 2021-10-09

34 commits to main branch, last one 2 years ago