105 results found

974
9.9k
bsd-3-clause
98
LAVIS - A One-stop Library for Language-Vision Intelligence
Created 2022-08-24
492 commits to main branch, last one 2 days ago
(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.
Created 2020-10-13
610 commits to 2024-Version-2.0 branch, last one 14 days ago
155
1.7k
mit
42
Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in PyTorch
Created 2023-10-18
172 commits to main branch, last one 4 months ago
261
1.7k
apache-2.0
27
FinRobot: An Open-Source AI Agent Platform for Financial Analysis using LLMs 🚀 🚀 🚀
Created 2024-02-27
269 commits to master branch, last one 3 days ago
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
Created 2022-09-28
62 commits to main branch, last one about a month ago
255
1.5k
apache-2.0
16
[ICLR 2024] Official implementation of " 🦙 Time-LLM: Time Series Forecasting by Reprogramming Large Language Models"
Created 2024-01-20
39 commits to main branch, last one 18 days ago
190
1.3k
apache-2.0
25
A flexible package for multimodal deep learning to combine tabular data with text and images using Wide and Deep models in PyTorch
Created 2017-10-21
940 commits to master branch, last one 14 days ago
Collects the latest CVPR (Conference on Computer Vision and Pattern Recognition) results, including papers, code, and demo videos; recommendations are welcome...
Created 2021-03-13
19 commits to main branch, last one 6 months ago
Recent Advances in Vision and Language PreTrained Models (VL-PTMs)
Created 2020-03-25
38 commits to master branch, last one 2 years ago
This repository contains various models targeting multimodal representation learning and multimodal fusion for downstream tasks such as multimodal sentiment analysis.
Created 2021-08-28
95 commits to main branch, last one 2 years ago
Official implementation for "Blended Latent Diffusion" [SIGGRAPH 2023]
Created 2022-06-06
10 commits to master branch, last one 5 months ago
A collection of resources on applications of multi-modal learning in medical imaging.
Created 2022-07-13
151 commits to main branch, last one 9 days ago
A collection of parameter-efficient transfer learning papers focusing on computer vision and multimodal domains.
Created 2022-12-22
66 commits to main branch, last one about a month ago
25
359
apache-2.0
4
This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI"
Created 2023-11-23
127 commits to main branch, last one 16 hours ago
51
335
bsd-3-clause
12
Reference mapping for single-cell genomics
Created 2019-08-12
1,175 commits to master branch, last one 5 months ago
Towards Generalist Biomedical AI
Created 2023-07-31
118 commits to main branch, last one 9 months ago
A Survey on multimodal learning research.
Created 2021-09-20
79 commits to main branch, last one about a year ago
Multimodal Sarcasm Detection Dataset
Created 2019-02-20
82 commits to master branch, last one 3 months ago
Deep learning based content moderation from text, audio, video & image input modalities.
Created 2022-09-22
46 commits to main branch, last one 6 months ago
15
298
unknown
8
CVPR'24, Official Codebase of our Paper: "Let's Think Outside the Box: Exploring Leap-of-Thought in Large Language Models with Creative Humor Generation".
Created 2023-12-01
33 commits to main branch, last one 7 months ago
Recent Advances in Vision and Language Pre-training (VLP)
Created 2021-09-14
56 commits to main branch, last one about a year ago
Collects the latest ECCV results, including papers, code, and demo videos. Recommendations are welcome!
Created 2022-07-04
48 commits to main branch, last one 2 years ago
List of academic resources on Multimodal ML for Music
Created 2022-12-29
11 commits to main branch, last one about a year ago
27
267
apache-2.0
3
Implementation of CVPR 2023 paper "Prompting Large Language Models with Answer Heuristics for Knowledge-based Visual Question Answering".
Created 2023-01-09
63 commits to main branch, last one about a year ago
[CVPR 2024] Official code for "Text-Driven Image Editing via Learnable Regions"
Created 2023-11-28
130 commits to main branch, last one about a month ago
TensorFlow implementation of "Multimodal Speech Emotion Recognition using Audio and Text," IEEE SLT-18
Created 2019-01-13
49 commits to master branch, last one 8 months ago
A comprehensive reading list for Emotion Recognition in Conversations
Created 2020-07-05
44 commits to master branch, last one 9 months ago
23
237
bsd-2-clause
5
Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder".
Created 2023-03-20
44 commits to master branch, last one about a year ago