Search Results - RepositoryStats

meshed-memory-transformer aimagelab

135

530

bsd-3-clause

11

Meshed-Memory Transformer for Image Captioning. CVPR 2020

pytorch cvpr2020 transformer visual-semantic image-captioning captioning-images caption-generation

Created 2019-12-12

10 commits to master branch, last one 2 years ago

show-control-and-tell aimagelab

61

281

bsd-3-clause

9

Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions. CVPR 2019

pytorch cvpr2019 visual-semantic image-captioning captioning-images caption-generation

Created 2019-02-14

12 commits to master branch, last one 2 years ago

Scan2Cap daveredrum

16

103

other

7

[CVPR 2021] Scan2Cap: Context-aware Dense Captioning in RGB-D Scans

3d cvpr scans pytorch cvpr2021 point-cloud deep-learning computer-vision caption-generation natural-language-processing

Created 2020-12-07

18 commits to main branch, last one 2 years ago

ShapeGPT OpenShapeLab

1

95

unknown

17

ShapeGPT: 3D Shape Generation with A Unified Multi-modal Language Model, a unified and user-friendly shape-language model

gpt shape chatgpt unified multi-modal 3d-generation language-model caption-generation

Created 2023-11-30

22 commits to main branch, last one about a year ago

Vote2Cap-DETR ch3cook-fdu

8

89

mit

2

[CVPR 2023] Vote2Cap-DETR and [T-PAMI 2024] Vote2Cap-DETR++; A set-to-set perspective towards 3D Dense Captioning; State-of-the-Art 3D Dense Captioning methods

t-pami pytorch cvpr2023 3d-models 3d-detection deep-learning dense-captioning caption-generation vision-and-language multimodal-deep-learning

Created 2022-11-28

101 commits to master branch, last one 7 months ago

D3Net daveredrum

5

43

unknown

2

[ECCV 2022] D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding

3d eccv eccv2022 point-cloud deep-learning computer-vision visual-grounding caption-generation semi-supervised-learning natural-language-processing

Created 2021-11-30

12 commits to main branch, last one 2 years ago

Image-Captioning tanishqgautam

15

41

unknown

3

Implemented 3 different architectures to tackle the image captioning problem, i.e., Merged Encoder-Decoder, Bahdanau Attention, and Transformers

nlp attention tensorflow transformers deep-learning computer-vision encoder-decoder image-captioning caption-generation

Created 2021-01-04

20 commits to main branch, last one 4 years ago