56 results found

508
6.1k
apache-2.0
57
SGLang is a fast serving framework for large language models and vision language models.
Created 2024-01-08
1,235 commits to main branch, last one 11 hours ago
542
3.7k
apache-2.0
419
Nexa SDK is a comprehensive toolkit for supporting ONNX and GGML models. It supports text generation, image generation, vision-language models (VLM), auto-speech-recognition (ASR), and text-to-speech ...
Created 2024-08-16
1,010 commits to main branch, last one 20 hours ago
161
1.9k
mit
26
The Cradle framework is a first attempt at General Computer Control (GCC). Cradle supports agents to ace any computer task by enabling strong reasoning abilities, self-improvement, and skill curation, ...
Created 2024-03-03
35 commits to main branch, last one 10 days ago
An AI-powered file management tool that ensures privacy by organizing local texts and images. Using Llama3.2 3B and Llava v1.6 models with the Nexa SDK, it intuitively scans, restructures, and organizes ...
Created 2024-09-21
33 commits to main branch, last one about a month ago
157
1.4k
apache-2.0
31
[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
Created 2023-10-16
935 commits to main branch, last one 5 days ago
An LLM agent framework in ComfyUI that includes Omost, GPT-sovits, ChatTTS, GOT-OCR2.0, and FLUX prompt nodes, provides access to Feishu and Discord, and adapts to all LLMs with OpenAI/Gemini-style interfaces, such as o1,ol...
Created 2024-04-13
2,232 commits to main branch, last one 22 hours ago
62
940
apache-2.0
22
A reading list for large models safety, security, and privacy (including Awesome LLM Security, Safety, etc.).
Created 2024-01-09
401 commits to main branch, last one 2 days ago
69
932
apache-2.0
19
A family of lightweight multimodal models.
Created 2024-01-31
113 commits to main branch, last one 27 days ago
Aircraft design optimization made fast through computational graph transformations (e.g., automatic differentiation). Composable analysis tools for aerodynamics, propulsion, structures, trajectory des...
Created 2019-05-15
4,275 commits to master branch, last one 3 months ago
A curated list of 3D vision papers related to the robotics domain in the era of large models (i.e., LLMs/VLMs), inspired by awesome-computer-vision, including papers, code, and related websites
Created 2024-08-12
41 commits to main branch, last one 13 days ago
36
446
unknown
11
[CVPR 2024 🔥] GeoChat, the first grounded Large Vision Language Model for Remote Sensing
Created 2023-11-23
70 commits to main branch, last one 7 months ago
Famous Vision Language Models and Their Architectures
Created 2024-02-15
231 commits to main branch, last one 2 months ago
Custom ComfyUI nodes for Vision Language Models, Large Language Models, Image to Music, Text to Music, Consistent and Random Creative Prompt Generation
Created 2024-01-24
271 commits to main branch, last one 11 days ago
Awesome-Jailbreak-on-LLMs is a collection of state-of-the-art, novel, exciting jailbreak methods on LLMs. It contains papers, codes, datasets, evaluations, and analyses.
Created 2024-06-27
147 commits to main branch, last one 18 hours ago
ScreenAgent: A Computer Control Agent Driven by Visual Language Large Model (IJCAI-24)
Created 2024-01-15
53 commits to main branch, last one 2 months ago
A curated list of awesome papers on Embodied AI and related research/industry-driven resources.
Created 2023-07-21
38 commits to main branch, last one 3 months ago
31
248
apache-2.0
7
A streamlined and customizable framework for efficient large model evaluation and performance benchmarking
Created 2023-12-07
198 commits to main branch, last one 5 days ago
Phi-3.5 for Mac: Locally-run Vision and Language Models for Apple Silicon
Created 2024-05-27
57 commits to main branch, last one 2 months ago
[NeurIPS'24 Spotlight] EVE: Encoder-Free Vision-Language Models
Created 2024-06-14
19 commits to main branch, last one about a month ago
Awesome LLM Papers and repos on very comprehensive topics.
Created 2024-01-13
171 commits to main branch, last one 2 months ago
15
183
apache-2.0
9
Official code for Paper "Mantis: Multi-Image Instruction Tuning" (TMLR2024)
Created 2024-04-12
210 commits to main branch, last one 2 days ago
Ptera Software is a fast, easy-to-use, and open-source software package for analyzing flapping-wing flight.
Created 2020-03-23
751 commits to develop branch, last one about a year ago
21
163
apache-2.0
4
Seamlessly integrate state-of-the-art transformer models into robotics stacks
Created 2024-05-22
82 commits to main branch, last one 24 days ago
16
158
apache-2.0
6
RAI is a multi-vendor agent framework for robotics, utilizing Langchain and ROS 2 tools to perform complex actions, defined scenarios, free interface execution, log summaries, voice interaction and mo...
Created 2024-06-04
220 commits to development branch, last one 2 days ago
llama.cpp (GGUF LLMs) and llava.cpp (GGUF VLMs) for ROS 2
Created 2023-04-01
673 commits to main branch, last one a day ago
3
155
apache-2.0
5
LLaRA: Large Language and Robotics Assistant
Created 2024-06-07
20 commits to main branch, last one about a month ago
15
148
apache-2.0
4
PsyDI: Towards a Personalized and Progressively In-depth Chatbot for Psychological Measurements. (e.g. MBTI Measurement Agent)
Created 2024-04-12
198 commits to main branch, last one 3 days ago
[NeurIPS 2023 Oral] Quilt-1M: One Million Image-Text Pairs for Histopathology.
Created 2023-06-07
9 commits to main branch, last one 10 months ago
1
130
apache-2.0
4
JoyCaption is an image captioning Visual Language Model (VLM) being built from the ground up as a free, open, and uncensored model for the community to use in training Diffusion models.
Created 2024-10-12
3 commits to main branch, last one about a month ago
DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception
Created 2024-07-05
8 commits to main branch, last one about a month ago