Statistics for topic computer-vision
RepositoryStats tracks 595,858 GitHub repositories, of which 3,130 are tagged with the computer-vision topic. The most common primary language for repositories using this topic is Python (1,856). Other languages include Jupyter Notebook (448), C++ (197), JavaScript (66), C# (32), TypeScript (28), C (28), HTML (28), MATLAB (27), and Java (24).
Stargazers over time for topic computer-vision
Most starred repositories for topic computer-vision
Trending repositories for topic computer-vision
500 AI Machine learning Deep learning Computer vision NLP Projects with code
[arXiv 2024] Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
Build AI agents that have the full context. Open source, runs locally, developer friendly. 24/7 screen, mic, and keyboard recording and control.
[arXiv 2024] Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
The world's 1st free and open source palm recognition SDK for Windows and Linux (Palm detection, ROI extraction, Template extraction, Template matching)
Evolve your OpenCV Image Processing filters using Cartesian Genetic Programming
The source code of IEEE TPAMI 2025 "Hyper-YOLO: When Visual Object Detection Meets Hypergraph Computation".
A summary of open-source deep learning-based infrared and visible image fusion and some vision algorithms. Open-source code for infrared and visible image fusion.
[arXiv 2024] Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
500 AI Machine learning Deep learning Computer vision NLP Projects with code
Build AI agents that have the full context. Open source, runs locally, developer friendly. 24/7 screen, mic, and keyboard recording and control.
[arXiv 2024] Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
The source code of IEEE TPAMI 2025 "Hyper-YOLO: When Visual Object Detection Meets Hypergraph Computation".
The world's 1st free and open source palm recognition SDK for Windows and Linux (Palm detection, ROI extraction, Template extraction, Template matching)
Official Implementation of the paper: "A Distractor-Aware Memory for Visual Object Tracking with SAM2"
The source code of IEEE TPAMI 2025 "Hyper-YOLO: When Visual Object Detection Meets Hypergraph Computation".
[arXiv 2024] Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
Material Anything: Generating Materials for Any 3D Object via Diffusion
The world's 1st free and open source palm recognition SDK for Windows and Linux (Palm detection, ROI extraction, Template extraction, Template matching)
Official Implementation of the paper: "A Distractor-Aware Memory for Visual Object Tracking with SAM2"
Build AI agents that have the full context. Open source, runs locally, developer friendly. 24/7 screen, mic, and keyboard recording and control.
Visualize streams of multimodal data. Free, fast, easy to use, and simple to integrate. Built in Rust.
[arXiv 2024] Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
Explore a collection of resources and projects in Computer Science, covering algorithms, data structures, programming languages, and emerging technologies. Ideal for learners and enthusiasts looking t...
[TIP2024] MWFormer: Multi-Weather Image Restoration Using Degradation-Aware Transformers
Material Anything: Generating Materials for Any 3D Object via Diffusion
End-to-End SLAM with camera calibration, monocular prior integration and dense Rendering
Build AI agents that have the full context. Open source, runs locally, developer friendly. 24/7 screen, mic, and keyboard recording and control.
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
One-step image-to-image with Stable Diffusion turbo: sketch2image, day2night, and more
Build AI agents that have the full context. Open source, runs locally, developer friendly. 24/7 screen, mic, and keyboard recording and control.
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
Desktop app for automatically translating comics (BDs, Manga, Manhwa, Fumetti, and more) in a variety of formats (image, PDF, EPUB, CBR, CBZ, etc.) and in multiple languages.
Superfast AI decision making and intelligent processing of multi-modal data.
Official implementation of Posterior-Mean Rectified Flow: Towards Minimum MSE Photo-Realistic Image Restoration
[CVPR 2024 Highlight] "MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis" (Official Implementation)