A VLM-driven tool that processes surveillance videos, extracts frames, and generates annotations using a fine-tuned Florence-2 Vision-Language Model. Includes a Gradio-based interface for que...
Created: 2024-09-13
10 commits to main branch, last one 3 months ago