98 results found

903
9.1k
bsd-3-clause
96
LAVIS - A One-stop Library for Language-Vision Intelligence
Created 2022-08-24
490 commits to main branch, last one 6 months ago
(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.
Created 2020-10-13
588 commits to 2024-Version-2.0 branch, last one a day ago
133
1.4k
mit
39
Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in PyTorch
Created 2023-10-18
170 commits to main branch, last one 7 days ago
186
1.3k
apache-2.0
25
A flexible package for multimodal deep learning to combine tabular data with text and images using Wide and Deep models in PyTorch
Created 2017-10-21
855 commits to master branch, last one 10 days ago
A collection of the latest CVPR (Conference on Computer Vision and Pattern Recognition) results, including papers, code, and demo videos. Recommendations are welcome...
Created 2021-03-13
19 commits to main branch, last one 2 months ago
Recent Advances in Vision and Language PreTrained Models (VL-PTMs)
Created 2020-03-25
38 commits to master branch, last one 2 years ago
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
Created 2022-09-28
54 commits to main branch, last one 2 months ago
137
1.1k
apache-2.0
18
FinRobot: An Open-Source AI Agent Platform for Financial Applications using LLMs 🚀 🚀 🚀
Created 2024-02-27
172 commits to master branch, last one 2 days ago
164
980
apache-2.0
13
[ICLR 2024] Official implementation of "🦙 Time-LLM: Time Series Forecasting by Reprogramming Large Language Models"
Created 2024-01-20
36 commits to main branch, last one 12 days ago
This repository contains various models targeting multimodal representation learning and multimodal fusion for downstream tasks such as multimodal sentiment analysis.
Created 2021-08-28
95 commits to main branch, last one 2 years ago
Official implementation for "Blended Latent Diffusion" [SIGGRAPH 2023]
Created 2022-06-06
10 commits to master branch, last one 22 days ago
A collection of resources on applications of multi-modal learning in medical imaging.
Created 2022-07-13
130 commits to main branch, last one 4 days ago
A collection of parameter-efficient transfer learning papers focusing on computer vision and multimodal domains.
Created 2022-12-22
63 commits to main branch, last one 14 days ago
50
319
bsd-3-clause
13
Reference mapping for single-cell genomics
Created 2019-08-12
1,175 commits to master branch, last one 22 days ago
Deep learning based content moderation from text, audio, video & image input modalities.
Created 2022-09-22
46 commits to main branch, last one about a month ago
A collection of the latest ECCV results, including papers, code, and demo videos. Recommendations are welcome!
Created 2022-07-04
48 commits to main branch, last one about a year ago
A Survey on multimodal learning research.
Created 2021-09-20
79 commits to main branch, last one 10 months ago
Multimodal Sarcasm Detection Dataset
Created 2019-02-20
79 commits to master branch, last one 2 years ago
Recent Advances in Vision and Language Pre-training (VLP)
Created 2021-09-14
56 commits to main branch, last one about a year ago
18
282
apache-2.0
4
This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI"
Created 2023-11-23
77 commits to main branch, last one 2 days ago
List of academic resources on Multimodal ML for Music
Created 2022-12-29
11 commits to main branch, last one about a year ago
Towards Generalist Biomedical AI
Created 2023-07-31
118 commits to main branch, last one 4 months ago
27
260
apache-2.0
3
Implementation of CVPR 2023 paper "Prompting Large Language Models with Answer Heuristics for Knowledge-based Visual Question Answering".
Created 2023-01-09
63 commits to main branch, last one about a year ago
10
255
unknown
8
CVPR'24, Official Codebase of our Paper: "Let's Think Outside the Box: Exploring Leap-of-Thought in Large Language Models with Creative Humor Generation".
Created 2023-12-01
33 commits to main branch, last one 2 months ago
TensorFlow implementation of "Multimodal Speech Emotion Recognition using Audio and Text," IEEE SLT-18
Created 2019-01-13
49 commits to master branch, last one 3 months ago
A comprehensive reading list for Emotion Recognition in Conversations
Created 2020-07-05
44 commits to master branch, last one 4 months ago
[CVPR'22 Best Paper Finalist] Official PyTorch implementation of the method presented in "Learning Multi-View Aggregation In the Wild for Large-Scale 3D Semantic Segmentation"
Created 2022-04-15
2,640 commits to release branch, last one 10 months ago
22
212
bsd-2-clause
5
Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder".
Created 2023-03-20
44 commits to master branch, last one 7 months ago