76 results found

Reading list for research topics in multimodal machine learning
Created 2019-05-27
426 commits to master branch, last one 10 months ago
An open-source framework for training large multimodal models.
Created 2022-10-20
502 commits to main branch, last one 7 months ago
A curated list of Multimodal Related Research.
Created 2019-07-31
206 commits to master branch, last one 11 months ago
ICCV 2023 Papers: Discover cutting-edge research from ICCV 2023, the leading computer vision conference. Stay updated on the latest in computer vision and deep learning, with code included. ⭐ support ...
Created 2023-09-04
1,063 commits to main branch, last one 13 days ago
52
835
apache-2.0
12
[CVPR'24] UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition
Created 2023-11-24
61 commits to main branch, last one 5 months ago
134
833
apache-2.0
25
A Comparative Framework for Multimodal Recommender Systems
Created 2018-07-17
1,356 commits to master branch, last one 7 days ago
116
794
mit
12
An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
Created 2021-04-13
29 commits to master branch, last one 2 years ago
Papers, code and datasets about deep learning and multi-modal learning for video analysis
Created 2017-06-14
91 commits to master branch, last one 2 years ago
This repository contains various models targeting multimodal representation learning and multimodal fusion for downstream tasks such as multimodal sentiment analysis.
Created 2021-08-28
95 commits to main branch, last one about a year ago
[CVPR2023 Highlight] GRES: Generalized Referring Expression Segmentation
Created 2023-03-11
14 commits to main branch, last one 9 months ago
Multimodal model for text and tabular data, with HuggingFace transformers as the building block for text data
Created 2020-08-20
125 commits to master branch, last one 2 months ago
98
555
apache-2.0
41
Multi-Modal learning toolkit based on PaddlePaddle and PyTorch, supporting multiple applications such as multi-modal classification, cross-modal retrieval and image captioning.
Created 2022-01-05
198 commits to Pytorch branch, last one about a year ago
[ICCV 2023] MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions
Created 2023-08-01
11 commits to main branch, last one 6 months ago
34
446
other
14
Multi-modality pre-training
Created 2022-03-15
104 commits to main branch, last one 24 days ago
[NeurIPS 2021] Multiscale Benchmarks for Multimodal Representation Learning
Created 2021-03-05
1,258 commits to main branch, last one 4 months ago
64
430
mit
11
Knowledge-Aware machine LEarning (KALE): accessible machine learning from multiple sources for interdisciplinary research, part of the 🔥PyTorch ecosystem. ⭐ Star to support our work!
Created 2020-06-30
3,025 commits to main branch, last one 2 days ago
A curated list of awesome vision and language resources (still under construction... stay tuned!)
Created 2019-10-25
42 commits to master branch, last one 9 months ago
A collection of resources on applications of multi-modal learning in medical imaging.
Created 2022-07-13
124 commits to main branch, last one 26 days ago
17
336
mit
21
An open-source implementation of "Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning", an all-new multi-modal AI model that uses just a decoder to generate both text and images
Created 2023-07-14
73 commits to main branch, last one 5 months ago
Research Trends in LLM-guided Multimodal Learning.
Created 2023-05-29
16 commits to main branch, last one 7 months ago
[ECCV'22] Official repository of the paper "Class-agnostic Object Detection with Multi-modal Transformer".
Created 2021-11-16
33 commits to main branch, last one about a year ago
10
277
apache-2.0
13
[NeurIPS 2023] This repository includes the official implementation of our paper "An Inverse Scaling Law for CLIP Training"
Created 2023-04-26
78 commits to master branch, last one 7 months ago
19
273
apache-2.0
4
This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI"
Created 2023-11-23
76 commits to main branch, last one 20 hours ago
List of academic resources on Multimodal ML for Music
Created 2022-12-29
11 commits to main branch, last one about a year ago
[CVPR'24 Highlight] GPT4Point: A Unified Framework for Point-Language Understanding and Generation.
Created 2023-08-12
76 commits to main branch, last one about a month ago
24
260
mit
4
[IEEE Transactions on Medical Imaging/TMI] This repo is the official implementation of "LViT: Language meets Vision Transformer in Medical Image Segmentation"
Created 2022-03-10
64 commits to main branch, last one 7 months ago
ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processi...
Created 2023-08-01
560 commits to main branch, last one a day ago
21
190
other
17
[CVPR 2022] Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning
Created 2022-03-04
6 commits to main branch, last one about a year ago
A curated list of self-supervised multimodal learning resources.
Created 2023-03-31
13 commits to main branch, last one 10 months ago
8
161
apache-2.0
3
[CVPR 2022 Oral] TubeDETR: Spatio-Temporal Video Grounding with Transformers
Created 2022-03-19
18 commits to main branch, last one 8 months ago