5 results found

VisualGPT (CVPR 2022 Proceedings): GPT as a decoder for vision-language models
Created 2021-02-15
43 commits to main branch, last one about a year ago
CLIPxGPT Captioner is an image captioning model based on OpenAI's CLIP and GPT-2.
Created 2022-09-25
102 commits to main branch, last one 11 months ago
[CVPR23] A cascaded diffusion captioning model with a novel semantic-conditional diffusion process that upgrades the conventional diffusion model with an additional semantic prior.
Created 2022-12-05
8 commits to main branch, last one 5 months ago
An image captioning web application that combines React.js for the front end with Flask and Node.js for the back end, utilizing the MERN stack. Users can upload images and instantly receive automatic capt...
Created 2023-02-02
16 commits to master branch, last one 11 months ago
A Python-based CLI tool for captioning images with WD series, Joy-caption-pre-alpha, Meta Llama 3.2 Vision Instruct, and Qwen2 VL Instruct models.
Created 2024-09-01
23 commits to main branch, last one about a month ago