Statistics for topic web-scraping
RepositoryStats tracks 633,100 GitHub repositories, of which 255 are tagged with the web-scraping topic. The most common primary language for repositories using this topic is Python (118). Other languages include JavaScript (25), Jupyter Notebook (17), TypeScript (15), Go (12), and HTML (12).
Stargazers over time for topic web-scraping
Most starred repositories for topic web-scraping
Trending repositories for topic web-scraping
Official Firecrawl MCP Server - Adds powerful web scraping to Cursor, Claude and any other LLM clients.
The best and simplest free open source web page change detection, website watcher, restock monitor and notification service. Restock Monitor, change detection. Designed for simplicity - Simply monito...
Claim Free proxy list with United States IP addresses and use it for your projects.
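🚀 "Douyin_TikTok_Download_API" is an out-of-the-box, high-performance asynchronous data scraping tool for Douyin, Kuaishou, TikTok, and Bilibili. It supports API calls as well as online batch parsing and downloading.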
🔥Open Source No Code Web Data Extraction Platform. Turn Websites To APIs & Spreadsheets With No-Code Robots In Minutes🔥
Model Context Protocol server that integrates AgentQL's data extraction capabilities.
Model Context Protocol (MCP) Server for Graphlit Platform
A Playwright-based Node.js tool that bypasses search engine anti-scraping mechanisms to execute Google searches. Local alternative to SERP APIs with MCP server integration.
Python APIs for web automation, testing, and bypassing bot-detection.
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and o...
Scrapy, a fast high-level web crawling & scraping framework for Python.
Website-downloader is a powerful and versatile Python script designed to download entire websites along with all their assets. This tool allows you to create a local copy of a website, including HTML ...
🕷️ An undetectable, powerful, flexible, high-performance Python library that makes Web Scraping easy again!
Collection of patches for puppeteer and playwright to avoid automation detection and leaks. Helps to avoid Cloudflare and DataDome CAPTCHA pages. Easy to patch/unpatch, can be enabled/disabled on dema...
Undetected Python version of the Playwright testing and automation library.
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works...
A guide for extracting titles, authors, and citations from Google Scholar using Python and Oxylabs SERP Scraper API.
AgentQL is a suite of tools for connecting your AI to the web. Featuring a query language and Playwright integrations for interacting with elements and extracting data quickly, precisely, and at scale...