74 results found

794
8.2k
apache-2.0
72
SGLang is a fast serving framework for large language models and vision language models.
Created 2024-01-08
1,914 commits to main branch, last one 5 hours ago
976
6.3k
unknown
81
This repository offers a comprehensive collection of tutorials on state-of-the-art computer vision models and techniques. Explore everything from foundational architectures like ResNet to cutting-edge...
Created 2022-11-18
517 commits to main branch, last one 10 hours ago
535
4.7k
gpl-3.0
33
Effortless data labeling with AI support from Segment Anything and other awesome models.
Created 2023-05-23
652 commits to main branch, last one 18 days ago
615
4.3k
apache-2.0
425
Nexa SDK is a comprehensive toolkit for supporting GGML and ONNX models. It supports text generation, image generation, vision-language models (VLM), Audio Language Model, auto-speech-recognition (ASR...
Created 2024-08-16
1,231 commits to main branch, last one 12 hours ago
142
2.2k
apache-2.0
40
A GUI Agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language.
Created 2025-01-19
59 commits to main branch, last one 2 days ago
172
2.0k
mit
27
The Cradle framework is a first attempt at General Computer Control (GCC). Cradle supports agents to ace any computer task by enabling strong reasoning abilities, self-improvement, and skill curation, ...
Created 2024-03-03
35 commits to main branch, last one 2 months ago
An AI-powered file management tool that ensures privacy by organizing local texts and images. Using Llama3.2 3B and Llava v1.6 models with the Nexa SDK, it intuitively scans, restructures, and organizes ...
Created 2024-09-21
33 commits to main branch, last one 4 months ago
167
1.6k
apache-2.0
32
[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
Created 2023-10-16
945 commits to main branch, last one 12 days ago
137
1.4k
apache-2.0
50
Build multimodal language agents for fast prototype and production
Created 2024-07-04
416 commits to main branch, last one 4 days ago
🚀🚀🚀 A collection of some awesome public YOLO object detection series projects and the related object detection datasets.
Created 2022-02-19
398 commits to main branch, last one 22 days ago
LLM Agent Framework in ComfyUI includes MCP server, Omost, GPT-sovits, ChatTTS, GOT-OCR2.0, and FLUX prompt nodes, access to Feishu, Discord, and adapts to all LLMs with similar OpenAI / aisuite interfaces,...
Created 2024-04-13
2,434 commits to main branch, last one 6 days ago
73
1.1k
apache-2.0
22
A reading list for large models safety, security, and privacy (including Awesome LLM Security, Safety, etc.).
Created 2024-01-09
435 commits to main branch, last one 2 days ago
75
982
apache-2.0
20
A family of lightweight multimodal models.
Created 2024-01-31
114 commits to main branch, last one 2 months ago
Aircraft design optimization made fast through computational graph transformations (e.g., automatic differentiation). Composable analysis tools for aerodynamics, propulsion, structures, trajectory des...
Created 2019-05-15
4,280 commits to master branch, last one 21 days ago
48
640
apache-2.0
18
An open-sourced end-to-end VLM-based GUI Agent
Created 2023-11-28
45 commits to main branch, last one 3 days ago
A curated list of 3D Vision papers relating to the Robotics domain in the era of large models, i.e. LLMs/VLMs, inspired by awesome-computer-vision, including papers, codes, and related websites
Created 2024-08-12
41 commits to main branch, last one 2 months ago
Famous Vision Language Models and Their Architectures
Created 2024-02-15
231 commits to main branch, last one 4 months ago
🚀🚀🚀 A collection of some awesome public projects about Large Language Model (LLM), Vision Language Model (VLM), Vision Language Action (VLA), AI Generated Content (AIGC), the related Datasets and Applica...
Created 2023-02-15
147 commits to main branch, last one 2 days ago
39
499
unknown
11
[CVPR 2024 🔥] GeoChat, the first grounded Large Vision Language Model for Remote Sensing
Created 2023-11-23
72 commits to main branch, last one 2 months ago
Custom ComfyUI nodes for Vision Language Models, Large Language Models, Image to Music, Text to Music, Consistent and Random Creative Prompt Generation
Created 2024-01-24
271 commits to main branch, last one 2 months ago
Awesome-Jailbreak-on-LLMs is a collection of state-of-the-art, novel, exciting jailbreak methods on LLMs. It contains papers, codes, datasets, evaluations, and analyses.
Created 2024-06-27
190 commits to main branch, last one 24 days ago
45
379
apache-2.0
8
A streamlined and customizable framework for efficient large model evaluation and performance benchmarking
Created 2023-12-07
249 commits to main branch, last one 3 days ago
ScreenAgent: A Computer Control Agent Driven by Visual Language Large Model (IJCAI-24)
Created 2024-01-15
54 commits to main branch, last one 2 months ago
A curated list of awesome papers on Embodied AI and related research/industry-driven resources.
Created 2023-07-21
41 commits to main branch, last one 7 days ago
8
294
apache-2.0
7
JoyCaption is an image captioning Visual Language Model (VLM) being built from the ground up as a free, open, and uncensored model for the community to use in training Diffusion models.
Created 2024-10-12
4 commits to main branch, last one 2 months ago
[NeurIPS'24 Spotlight] EVE: Encoder-Free Vision-Language Models
Created 2024-06-14
19 commits to main branch, last one 3 months ago
Phi-3.5 for Mac: Locally-run Vision and Language Models for Apple Silicon
Created 2024-05-27
57 commits to main branch, last one 4 months ago
Awesome LLM Papers and repos on very comprehensive topics.
Created 2024-01-13
171 commits to main branch, last one 5 months ago
15
195
apache-2.0
9
Official code for Paper "Mantis: Multi-Image Instruction Tuning" (TMLR2024)
Created 2024-04-12
290 commits to main branch, last one 3 days ago
25
187
apache-2.0
3
RAI is a multi-vendor agent framework for robotics, utilizing Langchain and ROS 2 tools to perform complex actions, defined scenarios, free interface execution, log summaries, voice interaction and mo...
Created 2024-06-04
269 commits to development branch, last one 9 hours ago