105 results found
LAVIS - A One-stop Library for Language-Vision Intelligence
Created
2022-08-24
492 commits to main branch, last one 2 days ago
(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.
Created
2020-10-13
610 commits to 2024-Version-2.0 branch, last one 14 days ago
Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in PyTorch
Created
2023-10-18
172 commits to main branch, last one 4 months ago
FinRobot: An Open-Source AI Agent Platform for Financial Analysis using LLMs 🚀 🚀 🚀
Created
2024-02-27
269 commits to master branch, last one 3 days ago
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
ocr
document
documentai
multimodal
end-to-end-ocr
text-detection
computer-vision
vision-language
text-recognition
document-analysis
document-recognition
scene-text-detection
document-intelligence
vision-language-model
document-understanding
scene-text-recognition
artificial-intelligence
multimodal-deep-learning
vision-language-transformer
scene-text-detection-recognition
Created
2022-09-28
62 commits to main branch, last one about a month ago
[ICLR 2024] Official implementation of "🦙 Time-LLM: Time Series Forecasting by Reprogramming Large Language Models"
Created
2024-01-20
39 commits to main branch, last one 18 days ago
A flexible package for multimodal deep learning that combines tabular data with text and images using Wide and Deep models in PyTorch
Created
2017-10-21
940 commits to master branch, last one 14 days ago
Collects the latest CVPR (Conference on Computer Vision and Pattern Recognition) results, including papers, code, and demo videos. Recommendations are welcome...
Created
2021-03-13
19 commits to main branch, last one 6 months ago
Recent Advances in Vision and Language PreTrained Models (VL-PTMs)
Created
2020-03-25
38 commits to master branch, last one 2 years ago
awesome grounding: A curated list of research papers in visual grounding
Created
2018-09-03
97 commits to master branch, last one about a year ago
This repository contains various models targeting multimodal representation learning and multimodal fusion for downstream tasks such as multimodal sentiment analysis.
Created
2021-08-28
95 commits to main branch, last one 2 years ago
Official implementation for "Blended Latent Diffusion" [SIGGRAPH 2023]
Created
2022-06-06
10 commits to master branch, last one 5 months ago
A collection of resources on applications of multi-modal learning in medical imaging.
Created
2022-07-13
151 commits to main branch, last one 9 days ago
A collection of parameter-efficient transfer learning papers focusing on computer vision and multimodal domains.
Created
2022-12-22
66 commits to main branch, last one about a month ago
This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI"
Created
2023-11-23
127 commits to main branch, last one 16 hours ago
Reference mapping for single-cell genomics
Created
2019-08-12
1,175 commits to master branch, last one 5 months ago
Towards Generalist Biomedical AI
Created
2023-07-31
118 commits to main branch, last one 9 months ago
A Survey on multimodal learning research.
Created
2021-09-20
79 commits to main branch, last one about a year ago
Multimodal Sarcasm Detection Dataset
Created
2019-02-20
82 commits to master branch, last one 3 months ago
Deep learning based content moderation from text, audio, video & image input modalities.
Created
2022-09-22
46 commits to main branch, last one 6 months ago
Paper List of Pre-trained Foundation Recommender Models
llm
chatgpt
gpt4rec
llm4rec
chatgpt3
multimodal
chatgpt4rec
pre-training
transferable
language-model
foundation-model
transfer-learning
llm-recommendation
recommender-system
large-language-model
recommendation-system
multimodal-deep-learning
multimodalrecommendation
cross-domainrecommendation
cross-domain-recommendation
Created
2023-06-25
197 commits to main branch, last one 3 months ago
CVPR'24, Official Codebase of our Paper: "Let's Think Outside the Box: Exploring Leap-of-Thought in Large Language Models with Creative Humor Generation".
Created
2023-12-01
33 commits to main branch, last one 7 months ago
Recent Advances in Vision and Language Pre-training (VLP)
Created
2021-09-14
56 commits to main branch, last one about a year ago
Collects the latest ECCV results, including papers, code, and demo videos. Recommendations are welcome!
Created
2022-07-04
48 commits to main branch, last one 2 years ago
List of academic resources on Multimodal ML for Music
Created
2022-12-29
11 commits to main branch, last one about a year ago
Implementation of CVPR 2023 paper "Prompting Large Language Models with Answer Heuristics for Knowledge-based Visual Question Answering".
Created
2023-01-09
63 commits to main branch, last one about a year ago
[CVPR 2024] Official code for "Text-Driven Image Editing via Learnable Regions"
Created
2023-11-28
130 commits to main branch, last one about a month ago
TensorFlow implementation of "Multimodal Speech Emotion Recognition using Audio and Text," IEEE SLT-18
Created
2019-01-13
49 commits to master branch, last one 8 months ago
A comprehensive reading list for Emotion Recognition in Conversations
Created
2020-07-05
44 commits to master branch, last one 9 months ago
Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder".
Created
2023-03-20
44 commits to master branch, last one about a year ago