Search Results - RepositoryStats

mmdetection open-mmlab

9.5k

29.9k

apache-2.0

370

OpenMMLab Detection Toolbox and Benchmark

Created 2018-08-22

2,706 commits to main branch, last one 10 months ago

LaTeX-OCR lukas-blecher

1.0k

13.1k

mit

73

pix2tex: Using a ViT to convert images of equations into LaTeX code.

ocr vit latex python dataset im2text pytorch im2latex math-ocr im2markup latex-ocr image2text transformer deep-learning image-processing machine-learning vision-transformer

Created 2020-12-11

323 commits to main branch, last one 15 days ago

Transformers-Tutorials NielsRogge

1.5k

9.7k

mit

142

This repository contains demos I made with the Transformers library by HuggingFace.

bert gpt-2 pytorch layoutlm transformers vision-transformer

Created 2020-08-31

431 commits to master branch, last one 2 months ago

VAR FoundationVision

427

6.4k

mit

121

[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simp...

gpt gpt-2 neurips transformers generative-ai diffusion-models generative-model image-generation vision-transformer auto-regressive-model autoregressive-models large-language-models

Created 2024-04-01

45 commits to main branch, last one 15 days ago

omniparse adithya-s-k

473

5.9k

gpl-3.0

36

Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks

ocr omniparser web-crawler whisper-api parse-server ingestion-api parser-library vision-transformer

Created 2024-06-04

122 commits to main branch, last one about a month ago

Awesome-Transformer-Attention cmhungsteve

490

4.7k

unknown

130

An ultimately comprehensive paper list of Vision Transformer/Attention, including papers, codes, and related websites

vit detr papers transformer awesome-list transformers deep-learning self-attention transformer-cv computer-vision transformer-models vision-transformer visual-transformer attention-mechanism transformer-awesome transformer-with-cv attention-mechanisms transformer-architecture

Created 2021-09-15

1,600 commits to main branch, last one 4 months ago

SwinIR JingyunLiang

558

4.5k

apache-2.0

53

SwinIR: Image Restoration Using Swin Transformer (official repository)

image-sr denoising deblocking restoration transformer decompression image-denoising image-deblocking low-level-vision super-resolution image-restoration vision-transformer image-super-resolution compression-artifact-reduction real-world-image-super-resolution lightweight-image-super-resolution

Created 2021-08-16

66 commits to main branch, last one 2 years ago

Efficient-AI-Backbones huawei-noah

709

4.1k

unknown

54

Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.

pytorch ghostnet imagenet tensorflow transformer model-compression pretrained-models vision-transformer efficient-inference convolutional-neural-networks

Created 2019-11-16

152 commits to master branch, last one 21 days ago

mmpretrain open-mmlab

1.1k

3.5k

apache-2.0

30

OpenMMLab Pre-training Toolbox and Benchmark

mae beit clip moco resnet pytorch convnext mobilenet multimodal deep-learning swin-transformer pretrained-models vision-transformer image-classification constrastive-learning masked-image-modeling self-supervised-learning

Created 2020-07-09

974 commits to main branch, last one about a month ago

scenic google-research

442

3.4k

apache-2.0

39

Scenic: A Jax Library for Computer Vision Research and Beyond

jax research attention transformers deep-learning computer-vision vision-transformer

Created 2021-07-12

715 commits to main branch, last one 3 days ago

towhee towhee-io

254

3.3k

apache-2.0

29

Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.

llm vit milvus towhee pipeline embeddings transformer feature-vector computer-vision image-retrieval image-processing machine-learning video-processing embedding-vectors unstructured-data feature-extraction vision-transformer convolutional-networks

Created 2021-07-13

1,586 commits to main branch, last one 2 months ago

InternLM-XComposer InternLM

159

2.6k

apache-2.0

43

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

gpt llm mllm gpt-4 chatgpt foundation multimodal language-model multi-modality instruction-tuning vision-transformer large-language-model supervised-finetuning vision-language-model visual-language-learning large-vision-language-model

Created 2023-09-26

409 commits to main branch, last one 3 days ago

efficientvit mit-han-lab

197

2.5k

apache-2.0

43

Efficient vision foundation models for high-resolution generation and perception.

imagenet efficientvit segmentation high-resolution segment-anything vision-transformer efficient-diffusion-model deep-compression-autoencoder

Created 2023-04-05

134 commits to master branch, last one 12 days ago

EVA baaivision

169

2.4k

mit

30

EVA Series: Visual Representation Fantasies from BAAI

foundation-models vision-transformer representation-learning

Created 2022-11-14

276 commits to master branch, last one 4 months ago

Transformer-Explainability hila-chefer

241

1.8k

mit

21

[CVPR 2021] Official PyTorch implementation for Transformer Interpretability Beyond Attention Visualization, a novel method to visualize classifications by Transformer based networks.

vit bert cvpr2021 bert-model perturbation deep-learning explainability attention-matrix vision-transformer attention-visualization visualize-classifications transformer-interpretability

Created 2020-11-23

86 commits to main branch, last one about a year ago

EasyCV alibaba

206

1.8k

apache-2.0

32

An all-in-one toolkit for computer vision

pytorch transformers classification computer-vision object-detection vision-transformer self-supervised-learning

Created 2022-04-02

304 commits to master branch, last one 5 months ago

Cream microsoft

231

1.7k

mit

34

This is a collection of our NAS and Vision Transformer work.

nas rpe automl efficiency vit-compression vision-transformer knowledge-distillation

Created 2020-10-12

222 commits to main branch, last one 11 months ago

InternVideo OpenGVLab

91

1.5k

apache-2.0

27

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Created 2022-11-23

229 commits to main branch, last one 10 days ago

VideoMAE MCG-NJU

137

1.4k

other

16

[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training

mae pytorch transformer neurips-2022 video-analysis video-transformer action-recognition masked-autoencoder vision-transformer video-understanding self-supervised-learning video-representation-learning

Created 2022-03-23

64 commits to main branch, last one about a year ago

ViTPose ViTAE-Transformer

188

1.4k

apache-2.0

21

The official repo for [NeurIPS'22] "ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation" and [TPAMI'23] "ViTPose++: Vision Transformer for Generic Body Pose Estimation"

mae pytorch distillation deep-learning pose-estimation vision-transformer self-supervised-learning

Created 2022-04-27

19 commits to main branch, last one about a year ago

VRT JingyunLiang

133

1.4k

other

18

VRT: A Video Restoration Transformer (official repository)

sr video video-sr denoising deblurring restoration transformer video-denoising low-level-vision super-resolution video-deblurring video-restoration vision-transformer video-super-resolution

Created 2022-01-18

15 commits to main branch, last one 2 years ago

ViT-Adapter czczup

141

1.3k

apache-2.0

18

[ICLR 2023 Spotlight] Vision Transformer Adapter for Dense Predictions

adapter object-detection vision-transformer semantic-segmentation

Created 2022-05-16

62 commits to main branch, last one 11 months ago

thepipe emcf

77

1.2k

mit

12

Extract clean data from anywhere, powered by vision-language models ⚡

pdf web gpt-4 gpt-4o scrapers multimodal vision-transformer large-language-models

Created 2024-03-22

312 commits to main branch, last one about a month ago

T2T-ViT yitu-opensource

176

1.2k

other

18

ICCV2021, Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet

vit t2t-transformer vision-transformer

Created 2021-01-23

63 commits to main branch, last one 2 years ago

awesome-attention-mechanism-in-cv pprp

166

1.1k

mit

16

Awesome List of Attention Modules and Plug&Play Modules in Computer Vision

plugandplay implementation attention-model computer-vision pytorch-attention vision-transformer attention-mechanisms

Created 2021-01-10

110 commits to main branch, last one about a year ago

VoxFormer NVlabs

89

1.1k

other

29

Official PyTorch implementation of VoxFormer [CVPR 2023 Highlight]

2d-to-3d 3d-perception deep-learning semantickitti computer-vision machine-learning voxel-proceessing autonomous-driving occupancy-grid-map vision-transformer autonomous-vehicles 3d-scene-understanding artificial-intelligence semantic-scene-completion

Created 2023-02-21

47 commits to main branch, last one about a year ago

ONE-PEACE OFA-Sys

64

984

apache-2.0

14

A general representation model across vision, audio, language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities

multimodal audio-language vision-language contrastive-loss foundation-models vision-transformer vision-and-language representation-learning

Created 2023-05-18

136 commits to main branch, last one 2 months ago