Statistics for language Python
RepositoryStats tracks 584,792 GitHub repositories, of which 116,437 are reported to use Python as their primary language.
Most starred repositories for language Python
Trending repositories for language Python
AnyModal is a Flexible Multimodal Language Model Framework for PyTorch
ASCII generator (image to text, image to image, video to video)
RAG that intelligently adapts to your use case, data, and queries
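openai-captcha-detection is a tool that uses OpenAI for CAPTCHA recognition. Its stated CAPTCHA recognition accuracy is currently 100%. By calling the OpenAI API, the project can recognize the text in complex CAPTCHA images, helping developers automate CAPTCHA handling.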
BiomedParse: A Foundation Model for Joint Segmentation, Detection, and Recognition of Biomedical Objects Across Nine Modalities
A collection of LogitsProcessors to customize and enhance LLM behavior for specific tasks.
A study that aims to transfer a single concept by using the self-attention capability of the DIT model
A high-quality, one-stop open-source data extraction tool that converts PDF to Markdown and JSON.
LLM-powered multiagent persona simulation for imagination enhancement and business insights.
An AI memory layer with short- and long-term storage, semantic clustering, and optional memory decay for context-aware applications.
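🎬 卡卡字幕助手 (Kaka Subtitle Assistant) | VideoCaptioner - An LLM-based intelligent subtitle assistant that produces high-quality subtitled videos in one click, with no GPU required! Supports the full workflow of generation, sentence segmentation, optimization, and translation, making video subtitling simple and efficient!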
The first AI agent that builds third-party integrations through reverse engineering platforms' internal APIs.
Tencent Hunyuan3D-1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation
Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚
Document (PDF) extraction and parsing API using state-of-the-art OCRs and Ollama-supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown
DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion
A flexible framework powered by ComfyUI for generating personalized Nobel Prize images.
Fast and accurate automatic speech recognition (ASR) for edge devices
Just 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)
Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)
The most powerful and modular diffusion model GUI, API and backend with a graph/nodes interface.
Real-time face swap and one-click video deepfake with only a single image
An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
Finetune Llama 3.2, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 80% less memory
Generate high-definition short videos with one click using AI large language models.