Statistics for topic crawler
RepositoryStats tracks 595,856 Github repositories, of these 573 are tagged with the crawler topic. The most common primary language for repositories using this topic is Python (273). Other languages include: Go (70), JavaScript (52), Java (28), PHP (28), TypeScript (27), C# (16), HTML (16), Rust (11)
Stargazers over time for topic crawler
Most starred repositories for topic crawler (view more)
Trending repositories for topic crawler (view more)
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
A visual no-code/code-free web crawler/spider易采集:一个可视化浏览器自动化测试/数据采集/爬虫软件,可以无代码图形化的设计和执行爬虫任务。别名:ServiceWrapper面向Web应用的智能化服务封装系统。
🚀「Douyin_TikTok_Download_API」是一个开箱即用的高性能异步抖音、快手、TikTok、Bilibili数据爬取工具,支持API调用,在线批量解析及下载。
⚡ 一款用于自动语音识别 (ASR)、翻译的高性能异步 API。不需要购买Whisper API,使用本地运行的Whisper模型进行推理,并支持多GPU并发,针对分布式部署进行设计。还内置了包括TikTok、抖音等社交媒体平台的爬虫,可实现来自多个社交平台的无缝媒体处理,为媒体内容数据自动化处理提供了强大且可扩展的解决方案。
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and o...
⚡ 一款用于自动语音识别 (ASR)、翻译的高性能异步 API。不需要购买Whisper API,使用本地运行的Whisper模型进行推理,并支持多GPU并发,针对分布式部署进行设计。还内置了包括TikTok、抖音等社交媒体平台的爬虫,可实现来自多个社交平台的无缝媒体处理,为媒体内容数据自动化处理提供了强大且可扩展的解决方案。
SiteOne Crawler is a cross-platform website crawler and analyzer for SEO, security, accessibility, and performance optimization—ideal for developers, DevOps, QA engineers, and consultants. Supports Wi...
⚡ A subdomain enumeration tool leveraging diverse techniques, designed for advanced pentesting operations
Collection of patches for puppeteer and playwright to avoid automation detection and leaks. Helps to avoid Cloudflare and DataDome CAPTCHA pages. Easy to patch/unpatch, can be enabled/disabled on dema...
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and o...
🚀「Douyin_TikTok_Download_API」是一个开箱即用的高性能异步抖音、快手、TikTok、Bilibili数据爬取工具,支持API调用,在线批量解析及下载。
⚡ 一款用于自动语音识别 (ASR)、翻译的高性能异步 API。不需要购买Whisper API,使用本地运行的Whisper模型进行推理,并支持多GPU并发,针对分布式部署进行设计。还内置了包括TikTok、抖音等社交媒体平台的爬虫,可实现来自多个社交平台的无缝媒体处理,为媒体内容数据自动化处理提供了强大且可扩展的解决方案。
A visual no-code/code-free web crawler/spider易采集:一个可视化浏览器自动化测试/数据采集/爬虫软件,可以无代码图形化的设计和执行爬虫任务。别名:ServiceWrapper面向Web应用的智能化服务封装系统。
⚡ 一款用于自动语音识别 (ASR)、翻译的高性能异步 API。不需要购买Whisper API,使用本地运行的Whisper模型进行推理,并支持多GPU并发,针对分布式部署进行设计。还内置了包括TikTok、抖音等社交媒体平台的爬虫,可实现来自多个社交平台的无缝媒体处理,为媒体内容数据自动化处理提供了强大且可扩展的解决方案。
SiteOne Crawler GUI is a cross-platform website crawler and analyzer for SEO, security, accessibility, and performance optimization—ideal for developers, DevOps, QA engineers, and consultants. Support...
⚡ A subdomain enumeration tool leveraging diverse techniques, designed for advanced pentesting operations
A web crawler for downloading WEBM and MP4 video formats from Pornhub. This project is designed to scrape and download available video content for educational or research purposes. Note that usage mus...
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and o...
A visual no-code/code-free web crawler/spider易采集:一个可视化浏览器自动化测试/数据采集/爬虫软件,可以无代码图形化的设计和执行爬虫任务。别名:ServiceWrapper面向Web应用的智能化服务封装系统。
🚀「Douyin_TikTok_Download_API」是一个开箱即用的高性能异步抖音、快手、TikTok、Bilibili数据爬取工具,支持API调用,在线批量解析及下载。
⚡ A subdomain enumeration tool leveraging diverse techniques, designed for advanced pentesting operations
SiteOne Crawler GUI is a cross-platform website crawler and analyzer for SEO, security, accessibility, and performance optimization—ideal for developers, DevOps, QA engineers, and consultants. Support...
一个网络安全法律法规、安全政策、国家标准、行业标准知识库。A knowledge base of cybersecurity laws and regulations, security policies, national standards, and industry standards.
⚡ 一款用于自动语音识别 (ASR)、翻译的高性能异步 API。不需要购买Whisper API,使用本地运行的Whisper模型进行推理,并支持多GPU并发,针对分布式部署进行设计。还内置了包括TikTok、抖音等社交媒体平台的爬虫,可实现来自多个社交平台的无缝媒体处理,为媒体内容数据自动化处理提供了强大且可扩展的解决方案。
A web crawler for downloading WEBM and MP4 video formats from Pornhub. This project is designed to scrape and download available video content for educational or research purposes. Note that usage mus...
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works...
Undetectable, Lightning-Fast, and Adaptive Web Scraping for Python
Collection of patches for puppeteer and playwright to avoid automation detection and leaks. Helps to avoid Cloudflare and DataDome CAPTCHA pages. Easy to patch/unpatch, can be enabled/disabled on dema...
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
A visual no-code/code-free web crawler/spider易采集:一个可视化浏览器自动化测试/数据采集/爬虫软件,可以无代码图形化的设计和执行爬虫任务。别名:ServiceWrapper面向Web应用的智能化服务封装系统。
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and o...
👾 Fast and simple video download library and CLI tool written in Go
Collection of patches for puppeteer and playwright to avoid automation detection and leaks. Helps to avoid Cloudflare and DataDome CAPTCHA pages. Easy to patch/unpatch, can be enabled/disabled on dema...
📰 Newspaper4k a fork of the beloved Newspaper3k. Extraction of articles, titles, and metadata from news websites.
Sasori is a dynamic web crawler powered by Puppeteer, designed for lightning-fast endpoint discovery.
SiteOne Crawler is a cross-platform website crawler and analyzer for SEO, security, accessibility, and performance optimization—ideal for developers, DevOps, QA engineers, and consultants. Supports Wi...