Search Results - RepositoryStats

873

6.4k

mit

179

Reading list for research topics in multimodal machine learning

robotics healthcare reading-list deep-learning computer-vision machine-learning speech-processing multimodal-learning reinforcement-learning representation-learning natural-language-processing

Created 2019-05-27

435 commits to master branch, last one 9 months ago

open_flamingo mlfoundations

300

3.9k

mit

48

An open-source framework for training large multimodal models.

pytorch flamingo deep-learning language-model computer-vision in-context-learning multimodal-learning

Created 2022-10-20

502 commits to main branch, last one about a year ago

CoOp KaiyangZhou

213

1.9k

mit

15

Prompt Learning for Vision-Language Models (IJCV'22, CVPR'22)

prompt-learning foundation-models multimodal-learning

Created 2021-09-01

66 commits to main branch, last one 2 years ago

Awesome-Multimodal-Research Eurus-Holmes

149

1.3k

mit

40

A curated list of Multimodal Related Research.

awesome multimodal multimodal-learning multimodal-research

Created 2019-07-31

206 commits to master branch, last one about a year ago

UniRepLKNet AILab-CVC

57

975

apache-2.0

13

[CVPR'24] UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition

architecture deep-learning multimodal-learning artificial-intelligence convolutional-neural-networks

Created 2023-11-24

62 commits to main branch, last one 5 months ago

ICCV-2023-Papers DmitryRyumin

43

953

mit

14

ICCV 2023 Papers: Discover cutting-edge research from ICCV 2023, the leading computer vision conference. Stay updated on the latest in computer vision and deep learning, with code included. ⭐ support ...

Created 2023-09-04

1,068 commits to main branch, last one 7 months ago

cornac PreferredAI

153

941

apache-2.0

25

A Comparative Framework for Multimodal Recommender Systems

multimodality recommender-system multimodal-learning matrix-factorization recommendation-engine recommendation-system collaborative-filtering recommendation-algorithms

Created 2018-07-17

1,384 commits to master branch, last one about a month ago

CLIP4Clip ArrowLuo

126

928

mit

12

An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"

clip msvd lsmdc didemo msrvtt search ranking retrieval multimodal activitynet multimodality retrieval-model multimodal-learning video-clip-retrieval video-text-retrieval

Created 2021-04-13

29 commits to master branch, last one 2 years ago

multimodal-deep-learning declare-lab

156

817

mit

7

This repository contains various models targeting multimodal representation learning and multimodal fusion for downstream tasks such as multimodal sentiment analysis.

multimodal-learning multimodal-interactions multimodal-deep-learning multimodal-sentiment-analysis

Created 2021-08-28

95 commits to main branch, last one 2 years ago

Awsome-Deep-Learning-for-Video-Analysis HuaizhengZhang

172

791

mit

33

Papers, code and datasets about deep learning and multi-modal learning for video analysis

paper deep-learning video-dataset video-analysis machine-learning multimodal-learning video-classification

Created 2017-06-14

91 commits to master branch, last one 3 years ago

awesome-multimodal-in-medical-imaging richard-peng-xia

66

703

mit

17

A collection of resources on applications of multi-modal learning in medical imaging.

medical-imaging multimodal-learning large-language-models large-multimodal-models multimodal-deep-learning medical-report-generation visual-question-answering multimodal-large-language-models

Created 2022-07-13

158 commits to main branch, last one about a month ago

ReLA henghuiding

19

692

mit

5

[CVPR2023 Highlight] GRES: Generalized Referring Expression Segmentation

cvpr2023 multimodal-learning vision-language-transformer referring-image-segmentation referring-expression-segmentation referring-expression-comprehension

Created 2023-03-11

14 commits to main branch, last one about a year ago

Multimodal-Toolkit georgian-io

89

602

apache-2.0

23

Multimodal model for text and tabular data, with HuggingFace transformers as the building block for text data

transformer tabular-data multimodal-learning huggingface-transformers natural-language-processing

Created 2020-08-20

169 commits to master branch, last one 5 months ago

awesome-vision-and-language sangminwoo

41

532

unknown

12

A curated list of awesome vision and language resources (still under construction... stay tuned!)

awesome awesome-list multimodal-learning vision-and-language

Created 2019-10-25

48 commits to master branch, last one 4 months ago

MultiBench pliang279

80

529

mit

15

[NeurIPS 2021] Multiscale Benchmarks for Multimodal Representation Learning

robotics healthcare deep-learning computer-vision machine-learning speech-processing multimodal-learning representation-learning natural-language-processing

Created 2021-03-05

1,258 commits to main branch, last one about a year ago

MeViS henghuiding

22

521

mit

8

[ICCV 2023] MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions

mose-dataset mevis-dataset multimodal-learning video-understanding referring-expression-segmentation referring-expression-comprehension referring-video-object-segmentation

Created 2023-08-01

11 commits to main branch, last one about a year ago

XPretrain microsoft

37

489

other

12

Multi-modality pre-training

nlp multimedia pre-training computer-vision multimodal-learning

Created 2022-03-15

104 commits to main branch, last one 10 months ago

OMML njustkmg

75

470

apache-2.0

28

Multi-modal learning toolkit based on PaddlePaddle and PyTorch, supporting multiple applications such as multi-modal classification, cross-modal retrieval, and image captioning.

python pytorch multimodal paddlepaddle classification imagecaptioning multimodal-learning crossmodal-retrieval

Created 2022-01-05

198 commits to Pytorch branch, last one about a year ago

pykale pykale

64

455

mit

10

Knowledge-Aware machine LEarning (KALE): accessible machine learning from multiple sources for interdisciplinary research, part of the 🔥PyTorch ecosystem. ⭐ Star to support our work!

python pytorch multimodal data-science deep-learning meta-learning graph-analysis computer-vision machine-learning domain-adaptation transfer-learning multimodal-learning medical-image-analysis knowledge-aware-learning

Created 2020-06-30

3,103 commits to main branch, last one 13 days ago

ICASSP-2023-24-Papers DmitryRyumin

17

446

mit

31

ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processi...

Created 2023-08-01

1,000 commits to main branch, last one 2 months ago

MMMU MMMU-Benchmark

33

405

apache-2.0

3

This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI"

llm llms stem evaluation multimodal deep-learning multimodality computer-vision machine-learning foundation-models question-answering multimodal-learning deep-neural-networks large-language-models large-multimodal-models multimodal-deep-learning visual-question-answering natural-language-processing

Created 2023-11-23

147 commits to main branch, last one 20 days ago

GPT4Point Pointcept

24

379

mit

25

[CVPR'24 Highlight] GPT4Point: A Unified Framework for Point-Language Understanding and Generation.

llm 3d-generation multimodal-learning

Created 2023-08-12

76 commits to main branch, last one 11 months ago

CM3Leon kyegomez

18

359

mit

21

An open-source implementation of "Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning", a multimodal model that uses only a decoder to generate both text and images

dalle attention multimodal multimodality imagegeneration multimodal-learning attention-is-all-you-need

Created 2023-07-14

73 commits to main branch, last one about a year ago

Awesome-Multimodal-LLM HenryHZY

16

357

mit

17

Research Trends in LLM-guided Multimodal Learning.

llm multimodal instruction-tuning in-context-learning multimodal-learning large-language-models parameter-efficient-tuning parameter-efficient-learning multimodal-large-language-models

Created 2023-05-29

16 commits to main branch, last one about a year ago

LViT HUANGLIZI

32

333

mit

2

[IEEE Transactions on Medical Imaging/TMI] This repo is the official implementation of "LViT: Language meets Vision Transformer in Medical Image Segmentation"

pytorch segmentation vision-language multimodal-learning medical-image-analysis

Created 2022-03-10

65 commits to main branch, last one 20 days ago