23 results found
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
Created
2023-10-16
854 commits to main branch, last one a day ago
A family of lightweight multimodal models.
Created
2024-01-31
86 commits to main branch, last one a day ago
Aircraft design optimization made fast through modern automatic differentiation. Composable analysis tools for aerodynamics, propulsion, structures, trajectory design, and much more.
Created
2019-05-15
4,236 commits to master branch, last one about a month ago
The Cradle framework is a first attempt at General Computer Control (GCC). Cradle supports agents to ace any computer task by enabling strong reasoning abilities, self-improvement, and skill curation, ...
Created
2024-03-03
8 commits to main branch, last one about a month ago
A reading list for large model safety, security, and privacy.
Created
2024-01-09
299 commits to main branch, last one 9 hours ago
[CVPR 2024 🔥] GeoChat, the first grounded Large Vision Language Model for Remote Sensing
Created
2023-11-23
70 commits to main branch, last one 2 months ago
Custom ComfyUI nodes for Vision Language Models, Large Language Models, Image to Music, Text to Music, Consistent and Random Creative Prompt Generation
Created
2024-01-24
227 commits to main branch, last one a day ago
ScreenAgent: A Computer Control Agent Driven by Visual Language Large Model
Created
2024-01-15
45 commits to main branch, last one 3 days ago
Ptera Software is a fast, easy-to-use, and open-source software package for analyzing flapping-wing flight.
Created
2020-03-23
751 commits to develop branch, last one 10 months ago
A curated list of awesome papers on Embodied AI and related research/industry-driven resources.
Created
2023-07-21
37 commits to main branch, last one about a month ago
[NeurIPS 2023 Oral] Quilt-1M: One Million Image-Text Pairs for Histopathology.
Created
2023-06-07
9 commits to main branch, last one 4 months ago
Famous Vision Language Models and Their Architectures
Created
2024-02-15
220 commits to main branch, last one 4 days ago
llama.cpp (GGUF LLMs) and llava.cpp (GGUF VLMs) for ROS 2
Created
2023-04-01
400 commits to main branch, last one a day ago
Official code for the paper "Mantis: Multi-Image Instruction Tuning"
Created
2024-04-12
142 commits to main branch, last one 2 days ago
🧘🏻♂️ KarmaVLM (相生): A family of high-efficiency, powerful visual language models.
Created
2024-01-23
45 commits to main branch, last one about a month ago
Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detection and segmentation.
Created
2024-05-21
19 commits to main branch, last one 2 days ago
Jailbreaking Large Vision-language Models via Typographic Visual Prompts
Created
2023-11-08
60 commits to main branch, last one 28 days ago
This repo is a live list of papers on game playing and large multimodal models - "A Survey on Game Playing Agents and Large Models: Methods, Applications, and Challenges".
Created
2024-02-01
5 commits to main branch, last one about a month ago
M3DBench introduces a comprehensive 3D instruction-following dataset with support for interleaved multi-modal prompts. Furthermore, M3DBench provides a new benchmark to assess large models across 3D v...
Created
2023-12-08
35 commits to main branch, last one 5 months ago
A toolbox meant for aircraft design analyses.
Created
2021-01-26
976 commits to main branch, last one 27 days ago
A system for prompted weak supervision.
Created
2022-12-20
228 commits to main branch, last one 12 days ago
[ICRA 2024] Dream2Real: Zero-Shot 3D Object Rearrangement with Vision-Language Models
Created
2024-02-07
6 commits to main branch, last one 3 months ago
PsyDI: An MBTI agent that helps you understand your personality type through a relaxed multi-modal interaction.
Created
2024-04-12
170 commits to main branch, last one 11 days ago