114 results found

1.0k
10.4k
bsd-3-clause
96
LAVIS - A One-stop Library for Language-Vision Intelligence
Created 2022-08-24
492 commits to main branch, last one 4 months ago
492
3.0k
apache-2.0
48
FinRobot: An Open-Source AI Agent Platform for Financial Analysis using LLMs 🚀 🚀 🚀
Created 2024-02-27
269 commits to master branch, last one 4 months ago
(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.
Created 2020-10-13
614 commits to 2024-Version-2.0 branch, last one about a month ago
329
1.9k
apache-2.0
21
[ICLR 2024] Official implementation of "🦙 Time-LLM: Time Series Forecasting by Reprogramming Large Language Models"
Created 2024-01-20
39 commits to main branch, last one 4 months ago
160
1.8k
mit
43
Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in PyTorch
Created 2023-10-18
172 commits to main branch, last one 9 months ago
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
Created 2022-09-28
69 commits to main branch, last one 3 months ago
Collect the latest CVPR (Conference on Computer Vision and Pattern Recognition) results, including papers, code, and demo videos, etc., and welcome recommendations...
Created 2021-03-13
19 commits to main branch, last one 11 months ago
196
1.3k
apache-2.0
23
A flexible package for multimodal-deep-learning to combine tabular data with text and images using Wide and Deep models in PyTorch
Created 2017-10-21
945 commits to master branch, last one about a month ago
Recent Advances in Vision and Language PreTrained Models (VL-PTMs)
Created 2020-03-25
38 commits to master branch, last one 3 years ago
This repository contains various models targeting multimodal representation learning and multimodal fusion for downstream tasks such as multimodal sentiment analysis.
Created 2021-08-28
95 commits to main branch, last one 2 years ago
A collection of resources on applications of multi-modal learning in medical imaging.
Created 2022-07-13
158 commits to main branch, last one about a month ago
Official implementation for "Blended Latent Diffusion" [SIGGRAPH 2023]
Created 2022-06-06
10 commits to master branch, last one 9 months ago
33
405
apache-2.0
3
This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI"
Created 2023-11-23
147 commits to main branch, last one 20 days ago
A collection of parameter-efficient transfer learning papers focusing on computer vision and multimodal domains.
Created 2022-12-22
66 commits to main branch, last one 6 months ago
Towards Generalist Biomedical AI
Created 2023-07-31
118 commits to main branch, last one about a year ago
56
358
bsd-3-clause
11
Reference mapping for single-cell genomics
Created 2019-08-12
1,176 commits to master branch, last one about a month ago
Deep learning based content moderation from text, audio, video & image input modalities.
Created 2022-09-22
47 commits to main branch, last one 3 months ago
Multimodal Sarcasm Detection Dataset
Created 2019-02-20
82 commits to master branch, last one 7 months ago
12
322
unknown
7
Compose multimodal datasets 🎹
Created 2024-02-17
139 commits to main branch, last one 11 days ago
A Survey on multimodal learning research.
Created 2021-09-20
79 commits to main branch, last one about a year ago
15
308
unknown
8
CVPR'24, Official Codebase of our Paper: "Let's Think Outside the Box: Exploring Leap-of-Thought in Large Language Models with Creative Humor Generation".
Created 2023-12-01
33 commits to main branch, last one 11 months ago
Recent Advances in Vision and Language Pre-training (VLP)
Created 2021-09-14
56 commits to main branch, last one about a year ago
List of academic resources on Multimodal ML for Music
Created 2022-12-29
11 commits to main branch, last one 2 years ago
Collect the latest ECCV results, including papers, code, and demo videos, etc.; recommendations are welcome!
Created 2022-07-04
48 commits to main branch, last one 2 years ago
TensorFlow implementation of "Multimodal Speech Emotion Recognition using Audio and Text," IEEE SLT-18
Created 2019-01-13
49 commits to master branch, last one about a year ago
27
271
apache-2.0
2
Implementation of CVPR 2023 paper "Prompting Large Language Models with Answer Heuristics for Knowledge-based Visual Question Answering".
Created 2023-01-09
63 commits to main branch, last one about a year ago
A comprehensive reading list for Emotion Recognition in Conversations
Created 2020-07-05
44 commits to master branch, last one about a year ago
24
251
bsd-2-clause
5
Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder".
Created 2023-03-20
44 commits to master branch, last one about a year ago