Trending repositories for language Python
This is a PyTorch-based reimplementation of CrossFlow, as proposed in 'Flowing from Words to Pixels: A Framework for Cross-Modality Evolution'
Official implementation of the paper "LeviTor: 3D Trajectory Oriented Image-to-Video Synthesis"
Code release of our paper "DI-PCG: Diffusion-based Efficient Inverse Procedural Content Generation for High-quality 3D Asset Creation".
Uses gpt-4o and gpt-4-mini to write books on a given topic while researching with the Perplexity API
Docker environment to quickly try out "Genesis", a universal physics engine.
The official repo for "FATE: Full-head Gaussian Avatar with Textural Editing from Monocular Video"
🪄 XSSDynaGen is a tool designed to analyze URLs with parameters, identify the characters allowed by the server, and generate advanced XSS payloads based on the analysis results.
Convert PDF to fixed-layout EPUB, conserving the table of contents, inner cross-references and hyperlinks.
A generative world for general-purpose robotics & embodied AI learning.
Python tool for converting files and office documents to Markdown.
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
🍏 + 🎯 + 🐍 = Everything you need to query Apple's FindMy network!
🚀 Level up your GitHub profile readme with customizable cards including LOC statistics!
Python APIs for web automation, testing, and bypassing bot-detection.
PDF scientific paper translation with preserved formats: AI-based full-text bilingual translation of PDF documents that fully preserves the layout, supports services such as Google/DeepL/Ollama/OpenAI, and provides CLI/GUI/Docker
Collection of awesome LLM apps with RAG using OpenAI, Anthropic, Gemini and open-source models.
From RAG chatbots to code assistants to complex agentic pipelines and beyond, build LLM systems that run better, faster, and cheaper with tracing, evaluations, and dashboards.
Langflow is a low-code app builder for RAG and multi-agent AI applications. It’s Python-based and agnostic to any model, API, or database.
Replace 'hub' with 'ingest' in any github url to get a prompt-friendly (con)text to paste into any LLM
MarS: a Financial Market Simulation Engine Powered by Generative Foundation Model
No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents
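📺 IPTV live TV source update tool 🚀: ✨ CCTV, 📡 satellite TV, ☘️ Guangdong and other provincial local channels, 🌊 Hong Kong, Macau, and Taiwan, 🎬 movies, 🎥 Migu, 🏀 sports, 🪁 animation, 🎮 games, 🎵 music, 🏛 classic theater; supports IPv4/IPv6; supports adding custom channels; supports multicast sources, hotel sources, subscription sources, and keyword search; updates automatically twice a day, and the results can be used in players such as TVBox; can run via workflow, Docker (amd64/arm64/arm v7), command line, or GUI | IP...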
Official implementation of the paper "Generative Inbetweening through Frame-wise Conditions-Driven Video Generation"
Statistics of your activities on GitHub in 2024.
A seamless solution for using FastAPI's dependency injection system outside of route handlers, enabling painless reuse of dependencies in CLI tools, background tasks, and other non-HTTP contexts.
A simple, all-in-one tool for deploying on-demand WireGuard VPN servers on popular VPS providers—no ongoing subscriptions, effortless management, and automatic cleanup when you’re done.
🧠 Model-Driven test data generation platform enabling developers to create realistic, scalable, and privacy-compliant test data. Features model-driven data generation, GDPR compliance, and seamless P...
Official repo and evaluation implementation of VSI-Bench
A framework that enables autonomous Android and computer use with any LLM (local or remote)
The official implementation of the paper "BrushEdit: All-In-One Image Inpainting and Editing"
The official implementation of the paper "ColorFlow: Retrieval-Augmented Image Sequence Colorization"
Video Processing Service automates video processing: it extracts audio from videos, generates subtitles, and embeds the subtitles into the video.
Code for MegaSynth: Scaling Up 3D Scene Reconstruction with Synthesized Data
DynamiX is an automation tool for dynamically managing Plex collections. It pins and unpins library collections based on configurable time blocks, ensuring fresh and relevant content is featured. ...
TTSAudioNormalizer is a specialized tool for TTS data production, featuring descriptive statistical analysis of audio loudness and loudness normalization operations.
Multi-lingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
Code to accompany "A Method for Animating Children's Drawings of the Human Figure"
FastVideo is an open-source framework for accelerating large video diffusion models.
Zero-Shot Monocular Depth Completion with Guided Diffusion
A critical vulnerability, CVE-2024-53677, has been identified in the popular Apache Struts framework, potentially allowing attackers to execute arbitrary code remotely. This vulnerability arises from ...
Code for FreeScale, a tuning-free method for higher-resolution visual generation
[AAAI2025] Predicting the Original Appearance of Damaged Historical Documents
Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation".
A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.
We present StableAnimator, the first end-to-end ID-preserving video diffusion framework, which synthesizes high-quality videos without any post-processing, conditioned on a reference image and a seque...
A lightweight task engine for building stateful AI agents that prioritizes simplicity and flexibility.
Large Concept Models: Language modeling in a sentence representation space
[arXiv 2024] Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
A chatbot/GraphRAG framework that creates multiple LLM agents from social-platform user comments and lets them debate specific topics.
Modern YouTube downloader with a clean PyQt6 interface. Download videos in any quality, extract audio, fetch subtitles (including auto-generated), and view video metadata. Built with yt-dlp for reliab...
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)
Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"
HunyuanVideo: A Systematic Framework For Large Video Generation Model
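stock: fetches stock data, computes stock indicators, recognizes stock chart patterns, and provides comprehensive stock screening, stock-picking strategies, backtesting for validation, and automated stock trading; supports PC and mobile devices.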
File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.
A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON.
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.
Kickstart your LLMOps initiative with a flexible, robust, and productive Python package.
Repository for ShowUI: One Vision-Language-Action Model for GUI Visual Agent
[ARXIV'24] SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints
Learning Flow Fields in Attention for Controllable Person Image Generation
[768 Resolution] [Any "SDXL" Model] [Various Conditions] [Arbitrary Views] Official impl. of "MV-Adapter: Multi-view Consistent Image Generation Made Easy"
Just 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)
Auto_Jobs_Applier_AI_Agent aims to ease the job hunt by automating the job application process. Utilizing artificial intelligence, it enables users to apply for multiple jobs in an automated and p...
Generate high-definition short videos with one click using large AI models.
Devika is an Agentic AI Software Engineer that can understand high-level human instructions, break them down into steps, research relevant information, and write code to achieve the given objective. D...
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.
OCR, layout analysis, reading order, table recognition in 90+ languages
Real-time face swap and one-click video deepfake with only a single image
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
An opinionated list of awesome Python frameworks, libraries, software and resources.
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
A collection of learning resources for curious software engineers
An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
The #1 open-source voice interface for desktop, mobile, and ESP32 chips.
Start building LLM-empowered multi-agent applications in an easier way.
Official implementation of OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on
o1-engineer is a command-line tool designed to assist developers in managing and interacting with their projects efficiently. Leveraging the power of OpenAI's API, this tool provides functionalities s...
Code and dataset for photorealistic Codec Avatars driven from audio
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.