258 results found

pix2tex: Using a ViT to convert images of equations into LaTeX code.
Created 2020-12-11
323 commits to main branch, last one 15 days ago
This repository contains demos I made with the Transformers library by HuggingFace.
Created 2020-08-31
431 commits to master branch, last one 2 months ago
427
6.4k
mit
121
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simp...
Created 2024-04-01
45 commits to main branch, last one 15 days ago
473
5.9k
gpl-3.0
36
Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
Created 2024-06-04
122 commits to main branch, last one about a month ago
An ultimately comprehensive paper list of Vision Transformer/Attention, including papers, codes, and related websites
Created 2021-09-15
1,600 commits to main branch, last one 4 months ago
Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.
Created 2019-11-16
152 commits to master branch, last one 21 days ago
1.1k
3.5k
apache-2.0
30
OpenMMLab Pre-training Toolbox and Benchmark
Created 2020-07-09
974 commits to main branch, last one about a month ago
442
3.4k
apache-2.0
39
Scenic: A Jax Library for Computer Vision Research and Beyond
Created 2021-07-12
715 commits to main branch, last one 3 days ago
254
3.3k
apache-2.0
29
Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.
Created 2021-07-13
1,586 commits to main branch, last one 2 months ago
159
2.6k
apache-2.0
43
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions
Created 2023-09-26
409 commits to main branch, last one 3 days ago
197
2.5k
apache-2.0
43
Efficient vision foundation models for high-resolution generation and perception.
Created 2023-04-05
134 commits to master branch, last one 12 days ago
169
2.4k
mit
30
EVA Series: Visual Representation Fantasies from BAAI
Created 2022-11-14
276 commits to master branch, last one 4 months ago
[CVPR 2021] Official PyTorch implementation for Transformer Interpretability Beyond Attention Visualization, a novel method to visualize classifications by Transformer-based networks.
Created 2020-11-23
86 commits to main branch, last one about a year ago
206
1.8k
apache-2.0
32
An all-in-one toolkit for computer vision
Created 2022-04-02
304 commits to master branch, last one 5 months ago
231
1.7k
mit
34
This is a collection of our NAS and Vision Transformer work.
Created 2020-10-12
222 commits to main branch, last one 11 months ago
137
1.4k
other
16
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Created 2022-03-23
64 commits to main branch, last one about a year ago
188
1.4k
apache-2.0
21
The official repo for [NeurIPS'22] "ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation" and [TPAMI'23] "ViTPose++: Vision Transformer for Generic Body Pose Estimation"
Created 2022-04-27
19 commits to main branch, last one about a year ago
133
1.4k
other
18
VRT: A Video Restoration Transformer (official repository)
Created 2022-01-18
15 commits to main branch, last one 2 years ago
141
1.3k
apache-2.0
18
[ICLR 2023 Spotlight] Vision Transformer Adapter for Dense Predictions
Created 2022-05-16
62 commits to main branch, last one 11 months ago
77
1.2k
mit
12
Extract clean data from anywhere, powered by vision-language models ⚡
Created 2024-03-22
312 commits to main branch, last one about a month ago
ICCV2021, Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet
Created 2021-01-23
63 commits to main branch, last one 2 years ago
Awesome List of Attention Modules and Plug&Play Modules in Computer Vision
Created 2021-01-10
110 commits to main branch, last one about a year ago
64
984
apache-2.0
14
A general representation model across vision, audio, language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
Created 2023-05-18
136 commits to main branch, last one 2 months ago
A curated list of foundation models for vision and language tasks
Created 2023-04-04
282 commits to main branch, last one 2 days ago
Explainability for Vision Transformers
Created 2020-12-29
19 commits to main branch, last one 3 years ago
SOTA Semantic Segmentation Models in PyTorch
Created 2021-06-02
98 commits to main branch, last one 9 months ago