Search Results - RepositoryStats

99

1.1k

mit

29

awesome grounding: A curated list of research papers in visual grounding

arxiv paper papers grounding awesome-list paper-roadmap embodied-agent computer-vision image-grounding video-grounding phrase-grounding visual-grounding captioning-images captioning-videos language-grounding video-understanding multimodal-deep-learning natural-language-processing

Created 2018-09-03

97 commits to master branch, last one about a year ago

Robotic-grasping-papers rhett-chen

20

305

unknown

7

paper list of robotic grasping and some related works

grasp papers 6d-pose grasping robotics paper-list manipulation general-grasp semantic-grasp robotic-grasping visual-grounding task-oriented-grasp robotic-manipulation

Created 2022-03-29

82 commits to main branch, last one 4 months ago

awesome-described-object-detection Charles-Xie

22

257

unknown

9

A curated list of papers and resources related to Described Object Detection, Open-Vocabulary/Open-World Object Detection and Referring Expression Comprehension. Updated frequently and pull requests w...

awesome awesome-list visual-grounding open-vocabulary-detection open-world-object-detection referring-expression-comprehension

Created 2023-09-07

48 commits to main branch, last one 2 days ago

ScanRefer daveredrum

28

250

other

9

[ECCV 2020] ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language

3d eccv dataset pytorch point-cloud deep-learning computer-vision visual-grounding natural-language-processing

Created 2020-01-22

104 commits to master branch, last one 2 years ago

TubeDETR antoyang

9

177

apache-2.0

3

[CVPR 2022 Oral] TubeDETR: Spatio-Temporal Video Grounding with Transformers

stvg vidstg hc-stvg visual-grounding multimodal-learning video-understanding vision-and-language spatio-temporal-video-grounding

Created 2022-03-19

18 commits to main branch, last one about a year ago

Pseudo-Q LeapLabTHU

10

147

apache-2.0

3

[CVPR 2022] Pseudo-Q: Generating Pseudo Language Queries for Visual Grounding

pytorch cvpr2022 deep-learning computer-vision visual-grounding vision-and-language multimodal-deep-learning

Created 2022-03-14

43 commits to main branch, last one 8 months ago

SeqTR seanzhuh

14

134

unknown

0

SeqTR: A Simple yet Universal Network for Visual Grounding

visual-grounding auto-regressive-models

Created 2022-03-30

27 commits to main branch, last one 4 months ago

Awesome-Visual-Grounding linhuixiao

13

121

apache-2.0

5

[TPAMI reviewing] Towards Visual Grounding: A Survey

survey awesome grounding visual-grounding

Created 2024-07-03

66 commits to master branch, last one a day ago

EDA yanmin-wu

4

114

other

2

[CVPR 2023] EDA: Explicit Text-Decoupling and Dense Alignment for 3D Visual Grounding

visual-grounding 3d-visual-grounding vision-and-language 3d-vision-and-language

Created 2022-09-30

4 commits to master branch, last one about a year ago

Awesome-3D-Vision-and-Language jianghaojun

5

97

mit

3

A collection of 3D vision and language (e.g., 3D Visual Grounding, 3D Question Answering and 3D Dense Caption) papers and datasets.

awesome point-cloud computer-vision 3d-deep-learning visual-grounding 3d-vision-and-language multimodal-deep-learning natural-language-processing

Created 2022-04-15

21 commits to main branch, last one 2 years ago

VLTVG yangli18

9

95

unknown

2

Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning, CVPR 2022

cross-modal vision-language visual-grounding visual-linguistic

Created 2022-04-29

5 commits to master branch, last one 2 years ago

VLM-Grounder OpenRobotLab

1

91

unknown

1

[CoRL 2024] VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding

vlm agent gpt-4o robotics vlm-grounder visual-grounding vision-and-language large-language-models 3d-scene-understanding

Created 2024-10-17

6 commits to main branch, last one 3 months ago

awesome-rvos JerryX1110

4

87

mit

6

Referring Video Object Segmentation / Multi-Object Tracking Repo

rvos text image video refer-vos linguistic multi-modal youtube-vos refering-seg segmentation visual-grounding refer-segmentation multimodal-deep-learning

Created 2021-12-11

58 commits to main branch, last one about a year ago

M3DRef-CLIP 3dlg-hcvc

4

82

mit

1

[ICCV 2023] Multi3DRefer: Grounding Text Description to Multiple 3D Objects

3d clip cuda pytorch transformer localization deep-learning computer-vision visual-grounding pytorch-lightning

Created 2023-06-01

42 commits to main branch, last one about a year ago

GeoText-1652 MultimodalGeo

4

80

unknown

1

An official repo for ECCV 2024 Towards Natural Language-Guided Drones: GeoText-1652 Benchmark with Spatial Relation Matching

geotext university-1652 drone-navigation geo-localization visual-grounding vision-and-language natural-language-processing

Created 2024-07-12

53 commits to main branch, last one about a month ago

vRGV doc-doc

7

57

unknown

3

Visual Relation Grounding in Videos (ECCV'20, Spotlight)

hierarchical region-graph spatio-temporal visual-relation visual-grounding

Created 2019-11-28

97 commits to master branch, last one 2 years ago

3DVL_Codebase zlccccc

4

53

other

3

[CVPR2022 Oral] 3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds

3d-vqa pytorch cvpr2022 3d-vision scanrefer deep-learning dense-captioning visual-grounding 3d-visual-grounding 3d-vision-and-language 3d-visual-question-answering

Created 2022-04-19

10 commits to main branch, last one 2 years ago