23 results found

121
985
apache-2.0
24
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
Created 2023-10-16
854 commits to main branch, last one a day ago
55
706
apache-2.0
19
A family of lightweight multimodal models.
Created 2024-01-31
86 commits to main branch, last one a day ago
Aircraft design optimization made fast through modern automatic differentiation. Composable analysis tools for aerodynamics, propulsion, structures, trajectory design, and much more.
Created 2019-05-15
4,236 commits to master branch, last one about a month ago
The Cradle framework is a first attempt at General Computer Control (GCC). Cradle supports agents to ace any computer task by enabling strong reasoning abilities, self-improvement, and skill curation, ...
Created 2024-03-03
8 commits to main branch, last one about a month ago
27
463
apache-2.0
20
A reading list for large models safety, security, and privacy.
Created 2024-01-09
299 commits to main branch, last one 9 hours ago
18
307
unknown
7
[CVPR 2024 🔥] GeoChat, the first grounded Large Vision Language Model for Remote Sensing
Created 2023-11-23
70 commits to main branch, last one 2 months ago
Custom ComfyUI nodes for Vision Language Models, Large Language Models, Image to Music, Text to Music, Consistent and Random Creative Prompt Generation
Created 2024-01-24
227 commits to main branch, last one a day ago
ScreenAgent: A Computer Control Agent Driven by Visual Language Large Model
Created 2024-01-15
45 commits to main branch, last one 3 days ago
Ptera Software is a fast, easy-to-use, and open-source software package for analyzing flapping-wing flight.
Created 2020-03-23
751 commits to develop branch, last one 10 months ago
A curated list of awesome papers on Embodied AI and related research/industry-driven resources.
Created 2023-07-21
37 commits to main branch, last one about a month ago
[NeurIPS 2023 Oral] Quilt-1M: One Million Image-Text Pairs for Histopathology.
Created 2023-06-07
9 commits to main branch, last one 4 months ago
Famous Vision Language Models and Their Architectures
Created 2024-02-15
220 commits to main branch, last one 4 days ago
llama.cpp (GGUF LLMs) and llava.cpp (GGUF VLMs) for ROS 2
Created 2023-04-01
400 commits to main branch, last one a day ago
5
92
apache-2.0
7
Official code for the paper "Mantis: Multi-Image Instruction Tuning"
Created 2024-04-12
142 commits to main branch, last one 2 days ago
3
83
apache-2.0
1
🧘🏻‍♂️KarmaVLM (相生): A family of high-efficiency, powerful visual language models.
Created 2024-01-23
45 commits to main branch, last one about a month ago
Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detection and segmentation.
Created 2024-05-21
19 commits to main branch, last one 2 days ago
Jailbreaking Large Vision-language Models via Typographic Visual Prompts
Created 2023-11-08
60 commits to main branch, last one 28 days ago
2
50
unknown
2
This repo is a live list of papers on game playing and large multimodal models - "A Survey on Game Playing Agents and Large Models: Methods, Applications, and Challenges".
Created 2024-02-01
5 commits to main branch, last one about a month ago
1
43
unknown
3
M3DBench introduces a comprehensive 3D instruction-following dataset with support for interleaved multi-modal prompts. Furthermore, M3DBench provides a new benchmark to assess large models across 3D v...
Created 2023-12-08
35 commits to main branch, last one 5 months ago
A toolbox meant for aircraft design analyses.
Created 2021-01-26
976 commits to main branch, last one 27 days ago
5
42
bsd-3-clause
3
A system for prompted weak supervision.
Created 2022-12-20
228 commits to main branch, last one 12 days ago
6
42
unknown
5
[ICRA 2024] Dream2Real: Zero-Shot 3D Object Rearrangement with Vision-Language Models
Created 2024-02-07
6 commits to main branch, last one 3 months ago
2
41
apache-2.0
2
PsyDI: An MBTI agent that helps you understand your personality type through a relaxed multi-modal interaction.
Created 2024-04-12
170 commits to main branch, last one 11 days ago