9 results found

2.1k
19.5k
apache-2.0
160
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA), built towards GPT-4V-level capabilities and beyond.
Created 2023-04-17
460 commits to main branch, last one 4 months ago
242
3.6k
mit
100
🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
Created 2023-04-01
626 commits to main branch, last one 6 months ago
319
3.2k
bsd-3-clause
57
Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model
Created 2023-08-30
225 commits to main branch, last one 8 months ago
153
2.5k
apache-2.0
41
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
Created 2023-09-26
394 commits to main branch, last one 29 days ago
27
264
bsd-3-clause
12
(AAAI 2024) BLIVA: A Simple Multimodal LLM for Better Handling of Text-rich Visual Questions
Created 2023-08-02
26 commits to main branch, last one 5 months ago
An open-source implementation for training LLaVA-NeXT.
Created 2024-05-11
33 commits to master branch, last one 2 days ago
6
220
unknown
2
[CVPR'24] RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
Created 2023-11-29
73 commits to main branch, last one 17 days ago
3
83
apache-2.0
1
🧘🏻‍♂️ KarmaVLM (相生): A family of high-efficiency, powerful visual language models.
Created 2024-01-23
45 commits to main branch, last one 5 months ago