108 results found

979
10.1k
bsd-3-clause
97
LAVIS - A One-stop Library for Language-Vision Intelligence
Created 2022-08-24
492 commits to main branch, last one about a month ago
(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.
Created 2020-10-13
610 commits to 2024-Version-2.0 branch, last one about a month ago
285
1.8k
apache-2.0
33
FinRobot: An Open-Source AI Agent Platform for Financial Analysis using LLMs 🚀 🚀 🚀
Created 2024-02-27
269 commits to master branch, last one about a month ago
156
1.7k
mit
42
Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in pytorch
Created 2023-10-18
172 commits to main branch, last one 5 months ago
270
1.6k
apache-2.0
17
[ICLR 2024] Official implementation of "🦙 Time-LLM: Time Series Forecasting by Reprogramming Large Language Models"
Created 2024-01-20
39 commits to main branch, last one about a month ago
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
Created 2022-09-28
64 commits to main branch, last one 5 days ago
191
1.3k
apache-2.0
25
A flexible package for multimodal-deep-learning to combine tabular data with text and images using Wide and Deep models in Pytorch
Created 2017-10-21
940 commits to master branch, last one about a month ago
Collect the latest CVPR (Conference on Computer Vision and Pattern Recognition) results, including papers, code, and demo videos, etc., and welcome recommendations...
Created 2021-03-13
19 commits to main branch, last one 8 months ago
Recent Advances in Vision and Language PreTrained Models (VL-PTMs)
Created 2020-03-25
38 commits to master branch, last one 3 years ago
This repository contains various models targeting multimodal representation learning and multimodal fusion for downstream tasks such as multimodal sentiment analysis.
Created 2021-08-28
95 commits to main branch, last one 2 years ago
A collection of resources on applications of multi-modal learning in medical imaging.
Created 2022-07-13
151 commits to main branch, last one about a month ago
Official implementation for "Blended Latent Diffusion" [SIGGRAPH 2023]
Created 2022-06-06
10 commits to master branch, last one 6 months ago
A collection of parameter-efficient transfer learning papers focusing on computer vision and multimodal domains.
Created 2022-12-22
66 commits to main branch, last one 2 months ago
29
370
apache-2.0
4
This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI"
Created 2023-11-23
131 commits to main branch, last one 11 days ago
52
343
bsd-3-clause
12
Reference mapping for single-cell genomics
Created 2019-08-12
1,175 commits to master branch, last one 6 months ago
Towards Generalist Biomedical AI
Created 2023-07-31
118 commits to main branch, last one 10 months ago
Deep learning based content moderation from text, audio, video & image input modalities.
Created 2022-09-22
47 commits to main branch, last one 5 days ago
Multimodal Sarcasm Detection Dataset
Created 2019-02-20
82 commits to master branch, last one 4 months ago
A Survey on multimodal learning research.
Created 2021-09-20
79 commits to main branch, last one about a year ago
15
301
unknown
8
CVPR'24, Official Codebase of our Paper: "Let's Think Outside the Box: Exploring Leap-of-Thought in Large Language Models with Creative Humor Generation".
Created 2023-12-01
33 commits to main branch, last one 8 months ago
Recent Advances in Vision and Language Pre-training (VLP)
Created 2021-09-14
56 commits to main branch, last one about a year ago
Collects the latest ECCV results, including papers, code, and demo videos. Recommendations are welcome!
Created 2022-07-04
48 commits to main branch, last one 2 years ago
List of academic resources on Multimodal ML for Music
Created 2022-12-29
11 commits to main branch, last one about a year ago
[CVPR 2024] Official code for "Text-Driven Image Editing via Learnable Regions"
Created 2023-11-28
130 commits to main branch, last one 2 months ago
27
270
apache-2.0
3
Implementation of CVPR 2023 paper "Prompting Large Language Models with Answer Heuristics for Knowledge-based Visual Question Answering".
Created 2023-01-09
63 commits to main branch, last one about a year ago
TensorFlow implementation of "Multimodal Speech Emotion Recognition using Audio and Text," IEEE SLT-18
Created 2019-01-13
49 commits to master branch, last one 9 months ago
A comprehensive reading list for Emotion Recognition in Conversations
Created 2020-07-05
44 commits to master branch, last one 10 months ago
23
244
bsd-2-clause
5
Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder".
Created 2023-03-20
44 commits to master branch, last one about a year ago