6 results found

An open-source implementation for training LLaVA-NeXT.
Created 2024-05-11
36 commits to master branch, last one 5 months ago
13
324
unknown
5
[CVPR'25] RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness
Created 2024-05-13
62 commits to main branch, last one 21 days ago
28
278
apache-2.0
8
A minimal codebase for finetuning large multimodal models, supporting llava-1.5/1.6, llava-interleave, llava-next-video, llava-onevision, llama-3.2-vision, qwen-vl, qwen2-vl, phi3-v etc.
Created 2024-07-20
109 commits to main branch, last one about a month ago
6
98
apache-2.0
4
Matryoshka Multimodal Models
Created 2024-05-27
477 commits to main branch, last one 2 months ago
LLaVA-NeXT-Image-Llama3-Lora, Modified from https://github.com/arielnlee/LLaVA-1.6-ft
Created 2024-06-24
6 commits to main branch, last one 8 months ago
[AAAI 2025] HiRED strategically drops visual tokens in the image encoding stage to improve inference efficiency for High-Resolution Vision-Language Models (e.g., LLaVA-Next) under a fixed token budget...
Created 2024-08-19
19 commits to main branch, last one about a month ago