Search Results - RepositoryStats

MiniCPM-o OpenBMB

1.4k

19.2k

apache-2.0

139

MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone

minicpm minicpm-v multi-modal

Created 2024-01-29

524 commits to main branch, last one about a month ago

deeplake activeloopai

656

8.5k

apache-2.0

93

Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop....

Created 2019-08-09

9,213 commits to main branch, last one 16 days ago

modelscope modelscope

794

7.7k

apache-2.0

80

ModelScope: bring the notion of Model-as-a-Service to life.

cv nlp python speech science multi-modal deep-learning machine-learning

Created 2022-07-25

2,714 commits to master branch, last one 6 days ago

InternVL OpenGVLab

574

7.5k

mit

59

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance

gpt llm gpt-4o gpt-4v vit-6b vit-22b multi-modal image-classification image-text-retrieval video-classification semantic-segmentation vision-language-model

Created 2023-11-22

237 commits to main branch, last one 3 days ago

agentscope modelscope

403

7.0k

apache-2.0

37

Start building LLM-empowered multi-agent applications in an easier way.

llm mcp agent gpt-4 gpt-4o llama3 chatbot llm-agent multi-agent multi-modal drag-and-drop distributed-agents large-language-models

Created 2024-01-12

320 commits to main branch, last one 21 hours ago

CogVLM THUDM

430

6.5k

apache-2.0

70

A state-of-the-art-level open visual language model | Multimodal pre-trained model

multi-modal cross-modality language-model pretrained-models visual-language-models

Created 2023-09-18

184 commits to main branch, last one 10 months ago

DALLE-pytorch lucidrains

640

5.6k

mit

95

Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch

multi-modal transformers deep-learning text-to-image attention-mechanism artificial-intelligence

Created 2021-01-05

540 commits to main branch, last one about a year ago

Chinese-CLIP OFA-Sys

492

5.1k

mit

37

Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.

nlp clip chinese pytorch multi-modal transformers coreml-models deep-learning computer-vision vision-language contrastive-loss pretrained-models image-text-retrieval multi-modal-learning vision-and-language-pre-training

Created 2022-07-08

382 commits to master branch, last one 8 months ago

marqo marqo-ai

202

4.8k

apache-2.0

39

Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai

Created 2022-08-01

1,546 commits to mainline branch, last one a day ago

valhalla valhalla

724

4.8k

other

105

Open Source Routing Engine for OpenStreetMap

astar tiled routing dijkstra directions isochrones multi-modal openstreetmap routing-engine traveling-salesman

Created 2016-01-19

14,144 commits to master branch, last one a day ago

data-juicer modelscope

226

4.2k

apache-2.0

20

Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

Created 2023-08-01

349 commits to main branch, last one 3 days ago

VisualGLM-6B THUDM

424

4.2k

apache-2.0

42

Chinese and English multimodal conversational language model

gpt chatglm-6b multi-modal

Created 2023-04-23

95 commits to main branch, last one 7 months ago

OmniGen VectorSpaceLab

338

3.9k

mit

85

OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340

image diffusion image-edit multi-task multi-modal image-generation

Created 2024-09-16

154 commits to main branch, last one about a month ago

DeepKE zjunlp

715

3.9k

mit

45

[EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construction

Created 2018-08-01

1,696 commits to main branch, last one about a month ago

Video-LLaVA PKU-YuanGroup

233

3.2k

apache-2.0

30

【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

multi-modal instruction-tuning large-vision-language-model

Created 2023-10-23

154 commits to main branch, last one 4 months ago

LLamaSharp SciSharp

413

3.1k

mit

58

A C#/.NET library to run LLM (🦙LLaMA/LLaVA) on your local device efficiently.

gpt llm llama llava llama2 llama3 chatbot llamacpp llama-cpp multi-modal semantic-kernel

Created 2023-05-09

1,936 commits to master branch, last one 4 days ago

docarray docarray

233

3.0k

apache-2.0

45

Represent, send, store and search multimodal data

qdrant fastapi pytorch docarray protobuf pydantic weaviate dataclass multimodal cross-modal multi-modal nested-data deep-learning elasticsearch neural-search data-structures semantic-search machine-learning nearest-neighbor-search

Created 2021-12-14

1,467 commits to main branch, last one 24 days ago

CogVLM2 THUDM

153

2.3k

apache-2.0

29

GPT4V-level open-source multi-modal model based on Llama3-8B

cogvlm multi-modal language-model pretrained-models

Created 2024-05-10

87 commits to main branch, last one about a month ago

VLMEvalKit open-compass

325

2.2k

apache-2.0

11

Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks

gpt llm vit vqa clip gpt4 qwen llava claude gemini gpt-4v openai chatgpt pytorch evaluation openai-api multi-modal computer-vision large-language-models

Created 2023-12-01

1,285 commits to main branch, last one a day ago

LISA dvlab-research

150

2.1k

apache-2.0

12

Project Page for "LISA: Reasoning Segmentation via Large Language Model"

llm multi-modal segmentation large-language-model

Created 2023-08-01

139 commits to main branch, last one 3 months ago

MoE-LLaVA PKU-YuanGroup

134

2.1k

apache-2.0

23

Mixture-of-Experts for Large Vision-Language Models

moe multi-modal mixture-of-experts large-vision-language-model

Created 2023-12-14

228 commits to main branch, last one 4 months ago

GPTDiscord Kav-K

293

1.8k

mit

30

A robust, all-in-one GPT interface for Discord. ChatGPT-style conversations, image generation, AI-moderation, custom indexes/knowledgebase, youtube summarizer, and more!

Created 2022-12-08

1,380 commits to main branch, last one 11 months ago

RecSysPapers tangxyw

240

1.7k

bsd-2-clause

65

A collection of industry classics and cutting-edge papers in the field of recommendation/advertising/search.

Created 2022-08-16

187 commits to main branch, last one 21 hours ago

MotionGPT OpenMotionLab

104

1.6k

mit

44

[NeurIPS 2023] MotionGPT: Human Motion as a Foreign Language, a unified motion-language generation model using LLMs

gpt motion chatgpt motiongpt multi-modal text-driven 3d-generation language-model text-to-motion motion-generation

Created 2023-06-20

74 commits to main branch, last one about a year ago

fastRAG IntelLabs

139

1.5k

apache-2.0

15

Efficient Retrieval Augmentation and Generation Framework

llm nlp colbert benchmark diffusion multi-modal transformers generative-ai summarization knowledge-graph semantic-search question-answering information-retrieval sentence-transformers

Created 2023-01-23

73 commits to main branch, last one 4 months ago

all-rag-techniques FareedKhan-dev

218

1.5k

mit

20

Implementation of all RAG techniques in a simpler way

ai llm rag llms openai python multi-modal

Created 2025-03-07

39 commits to main branch, last one 21 days ago

Transformer-in-Vision DirtyHarryLYL

142

1.3k

unknown

85

Recent Transformer-based CV and related works.

paper multi-modal transformer deep-learning self-attention computer-vision visual-language vision-transformers

Created 2021-02-11

828 commits to main branch, last one about a year ago

modelfusion vercel

89

1.3k

mit

12

The TypeScript library for building AI applications.

Created 2023-05-25

2,522 commits to main branch, last one 11 months ago

SALMONN bytedance

96

1.2k

apache-2.0

26

SALMONN: Speech Audio Language Music Open Neural Network

audio music speech iclr2024 research bytedance icml-2024 multi-modal audio-processing speech-recognition tsinghua-university large-language-models

Created 2023-08-11

63 commits to main branch, last one about a month ago

MedMNIST MedMNIST

177

1.2k

apache-2.0

14

[pip install medmnist] 18x Standardized Datasets for 2D and 3D Biomedical Image Classification

2d 3d mnist automl dataset medical pytorch medmnist benchmark decathlon multi-modal deep-learning classification medical-imaging machine-learning few-shot-learning federated-learning medical-image-analysis medical-image-computing

Created 2020-10-25

124 commits to main branch, last one 3 months ago