Trending repositories for topic web-scraping
🔥 Open-source no-code web data extraction platform. Turn websites to APIs and spreadsheets with no-code robots in minutes! [In Beta]
The best and simplest free open source web page change detection, website watcher, restock monitor and notification service. Restock Monitor, change detection. Designed for simplicity - Simply monito...
Undetectable, Lightning-Fast, and Adaptive Web Scraping for Python
High-performance HTML5 parser for Ruby based on Lexbor, with support for both CSS selectors and XPath.
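🚀 "Douyin_TikTok_Download_API" is an out-of-the-box, high-performance asynchronous data scraping tool for Douyin, Kuaishou, TikTok, and Bilibili, supporting API calls and online batch parsing and downloading.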
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and o...
Scrapy, a fast high-level web crawling & scraping framework for Python.
AgentQL is an AI-powered query language for web scraping and automation. It uses natural language selectors to find data on any page, including authenticated content. AgentQL queries are self-healing ...
Python binding for curl-impersonate fork via cffi. An HTTP client that can impersonate browser TLS/JA3/HTTP2 fingerprints.
Python binding to Modest and Lexbor engines (fast HTML5 parser with CSS selectors).
📊 Blazing fast Python framework for web crawling, scraping, testing, and reporting. Supports pytest. Stealth abilities: UC Mode and CDP Mode.
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works...
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
Scrape data from Google Maps. Extracts data such as the name, address, phone number, website URL, rating, number of reviews, latitude and longitude, reviews, email, and more for each place.
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
List of libraries, tools and APIs for web scraping and data processing.
Undetected Python version of the Playwright testing and automation library.
Undetected Web-Scraping & Seamless HTML Parsing in Python!
A fork of Dragnet that also extracts author, headline, date, and keywords from context, with built-in metadata extraction, all in one package.
Collection of patches for puppeteer and playwright to avoid automation detection and leaks. Helps to avoid Cloudflare and DataDome CAPTCHA pages. Easy to patch/unpatch, can be enabled/disabled on dema...
Web scraper with a simple REST API living in Docker and using a Headless browser and Readability.js for parsing.
Uscrapper Vanta: Dive deeper into the web with this powerful open-source tool. Extract valuable insights with ease and efficiency, from both surface and deep web sources. Empower your data mining and ...
🪞PRIMP (Python Requests IMPersonate). The fastest python HTTP client that can impersonate web browsers
GroqCrawl is a powerful and user-friendly web crawling and scraping application built with Streamlit and powered by PocketGroq. It provides an intuitive interface for extracting LLM friendly AI consum...
Provides a list of fresh, working proxy servers (HTTP, HTTPS, SOCKS4 & SOCKS5) with multiple formats available for download.
📥 Bot for downloading any media from Instagram, Twitter and videos from TikTok and Youtube
PocketGroq is a powerful Python library that simplifies integration with the Groq API, offering advanced features for natural language processing, web scraping, and autonomous agent capabilities. Key ...
A basic Python 3 web scraper for extracting reviews from Amazon. Built using Selectorlib and requests.
A drop-in replacement for playwright-python patched with rebrowser-patches. It allows passing modern automation detection tests.
Modern tests to detect automated browser behavior. Covers the most important leaks from Puppeteer and Playwright.
With this tool, users can easily add credit card details, view saved cards, check card balances, and retrieve OTPs for verification
A huge collection of awesome beginner-friendly Python projects, from the very basics to advanced. Perfect repository for learning Python and enhancing your Python programming skills.
A tutorial for web scraping using Playwright headless browser
A guide for extracting titles, authors, and citations from Google Scholar using Python and Oxylabs SERP Scraper API.
AI agent that can SEE 👁️, control, navigate, & do stuff for you on your browser.
Code Integrations for our Swiss Quality Residential Proxies in multiple Programming Languages.
Save web pages as Safari webarchive files from the command line
A curated list with useful Python programming tools and libraries, as well as other noteworthy resources.
aiohttp-like interface to Chromium, based on selenium_driverless, to bypass Cloudflare.
Free Trial Amazon Scraper API for extracting search, product, offer listing, reviews, question and answers, best sellers and sellers data.
HTTP(S)/SOCKS5 rotating residential proxies - code examples & general information.
Python quick start guides to get the most out of Oxylabs' Web Scraper API free trial.
In this Python Web Scraping Tutorial, we will outline everything needed to get started with web scraping. We will begin with simple examples and move on to relatively more complex ones.