39 results found

2.2k
20.3k
apache-2.0
158
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Created 2023-04-17
460 commits to main branch, last one 6 months ago
2.1k
12.5k
other
218
🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP
Created 2018-11-12
1,960 commits to main branch, last one 11 months ago
Simple command line tool for text to image generation using OpenAI's CLIP and Siren (Implicit neural representation network). Technique was originally created by https://twitter.com/advadnoun
Created 2021-01-17
231 commits to main branch, last one 2 years ago
243
3.6k
mit
100
🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
Created 2023-04-01
626 commits to main branch, last one 8 months ago
154
2.5k
apache-2.0
43
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
Created 2023-09-26
395 commits to main branch, last one about a month ago
240
1.8k
agpl-3.0
36
The Enterprise-Grade Production-Ready Multi-Agent Orchestration Framework. Join our community: https://discord.com/servers/agora-999382051935506503
Created 2023-05-11
3,186 commits to master branch, last one a day ago
Algorithms and Publications on 3D Object Tracking
Created 2020-09-21
40 commits to master branch, last one about a year ago
Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, BLIP...
Created 2023-05-10
86 commits to main branch, last one 7 months ago
56
429
mit
13
The open source implementation of Gemini, the model that will "eclipse ChatGPT" by Google
Created 2023-08-29
149 commits to main branch, last one 5 months ago
[CVPR 2023] Collaborative Diffusion
Created 2023-03-22
15 commits to master branch, last one 11 months ago
30
403
apache-2.0
9
Parsing-free RAG supported by VLMs
Created 2024-10-14
66 commits to master branch, last one a day ago
[CVPR'23] MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation
Created 2022-12-11
16 commits to main branch, last one 5 months ago
An open-source implementation for training LLaVA-NeXT.
Created 2024-05-11
36 commits to master branch, last one 29 days ago
26
378
apache-2.0
8
Effortless plug-and-play optimizer to cut model training costs by 50%. New optimizer that is 2x faster than Adam on LLMs.
Created 2023-05-24
58 commits to main branch, last one 5 months ago
An official PyTorch implementation of the CRIS paper
Created 2022-06-01
31 commits to master branch, last one 7 months ago
6
235
unknown
2
[CVPR'24] RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
Created 2023-11-29
73 commits to main branch, last one 2 months ago
18
227
unknown
6
Unifying Voxel-based Representation with Transformer for 3D Object Detection (NeurIPS 2022)
Created 2022-06-01
13 commits to main branch, last one 2 years ago
This repo contains the official code of our work SAM-SLR which won the CVPR 2021 Challenge on Large Scale Signer Independent Isolated Sign Language Recognition.
Created 2021-03-15
89 commits to main branch, last one 2 years ago
Official code for NeurIPS2023 paper: CoDA: Collaborative Novel Box Discovery and Cross-modal Alignment for Open-vocabulary 3D Object Detection
Created 2023-10-05
83 commits to main branch, last one 28 days ago
12
176
apache-2.0
3
Embed arbitrary modalities (images, audio, documents, etc) into large language models.
Created 2023-10-11
84 commits to main branch, last one 7 months ago
20
155
apache-2.0
21
An open-source, cloud-native serving framework for large multi-modal models (LMMs).
Created 2023-04-04
456 commits to main branch, last one about a year ago
19
144
apache-2.0
4
Seed, Code, Harvest: Grow Your Own App with Tree of Thoughts!
Created 2023-05-22
48 commits to main branch, last one about a year ago
(NeurIPS 2022 CellSeg Challenge - 1st Winner) Open source code for "MEDIAR: Harmony of Data-Centric and Model-Centric for Multi-Modality Microscopy"
Created 2022-11-16
79 commits to main branch, last one 7 months ago
20
137
gpl-3.0
9
An all-new Language Model That Processes Ultra-Long Sequences of 100,000+ Ultra-Fast
Created 2023-05-05
428 commits to master branch, last one 8 months ago
[CVPR 2024] Prompt Highlighter: Interactive Control for Multi-Modal LLMs
Created 2023-11-28
18 commits to main branch, last one 4 months ago
Implementation of MambaByte from the paper "MambaByte: Token-free Selective State Space Model" in PyTorch and Zeta
Created 2024-01-26
9 commits to main branch, last one 9 months ago
14
92
mit
3
Multi-modal Graph learning for Disease Prediction (IEEE Trans. on Medical imaging, TMI2022)
Created 2022-01-01
66 commits to main branch, last one 8 months ago
Implementation of MoE-Mamba from the paper "MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts" in PyTorch and Zeta
Created 2024-01-21
14 commits to main branch, last one 10 months ago
9
79
unknown
3
[TCSVT] CorrI2P: Deep Image-to-Point Cloud Registration via Dense Correspondence. The code of CorrI2P.
Created 2022-08-25
41 commits to main branch, last one about a month ago