5 results found
A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.
Created: 2024-06-13
32 commits to main branch, last one 25 days ago
[CVPR 2024 Highlight] OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation
Created: 2023-11-30
50 commits to main branch, last one 3 months ago
RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness
Created: 2024-05-13
57 commits to main branch, last one 14 days ago
The official code of the paper "Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration Rate".
Created: 2024-10-09
17 commits to main branch, last one 24 days ago
[CVPR 2024] Situational Awareness Matters in 3D Vision Language Reasoning
Created: 2024-06-11
16 commits to main branch, last one 12 days ago