Trending repositories for topic webscraping
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
Pydoll is a library for automating chromium-based browsers without a WebDriver, offering realistic interactions.
LLM based autonomous agent that conducts deep local and web research on any topic and generates a long report with citations.
Create agents that monitor and act on your behalf. Your agents are standing by!
🔥 Open Source No Code Web Data Extraction Platform. Turn Websites To APIs & Spreadsheets With No-Code Robots In Minutes 🔥
List of libraries, tools and APIs for web scraping and data processing.
Undetected Python version of the Playwright testing and automation library.
Undetected NodeJS version of the Playwright testing and automation library.
🕷️ An undetectable, powerful, flexible, high-performance Python library that makes Web Scraping easy again!
Undetected version of the Playwright testing and automation library.
Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON docu...
Turn Webpage to LLM friendly input text. Similar to Firecrawl and Jina Reader API. Makes RAG, AI web scraping, image & webpage links extraction easy.
A CLI tool to browse, play, and download anime in pt-br (Portuguese)
The web scraping open project repository aims to share knowledge and experiences about web scraping with Python
A Powerful web scraper powered by LLM | OpenAI, Gemini & Ollama
Scrapoxy is a super proxies manager that orchestrates all your proxies into one place, rather than spreading management across multiple scrapers. It manages IP rotation and fingerprinting, and smartly...
A blazing fast, async-first, undetectable webscraping/web automation framework based on ultrafunkamsterdam/nodriver. Now with Docker support!
Make your job hunt easy by automating your application process with this Auto Applier
Analysis of Bot Protection systems with available countermeasures. How to defeat anti-bot systems and get around browser fingerprinting scripts when scraping the web?
Android Automation Framework for Python on emulators (BlissOs, BlueStacks, LDPlayer, Memu, Mumu, Android Studio ...) and rooted devices WITHOUT ADB!
ListSync automates the import of your IMDB & Trakt lists into Overseerr & Jellyseerr, simplifying your movie management.
This is an adapted version of Jina AI's Reader for local deployment using Docker. Convert any URL to an LLM-friendly input with a simple prefix http://127.0.0.1:3000/https://website-to-scrape.com/
Uses Selenium to automate WhatsApp for functions such as SMS bombing, sending the same message to multiple users simultaneously, opening profiles, viewing statuses, and more.
LinkedIn scraper to retrieve and store a live stream of job postings
This code is used to perform web scraping and data extraction from Google Maps. It is particularly designed for obtaining information about businesses, including their name, address, website, phone nu...
Data collection from the Cian classified-ads site / A parser of general information from the site cian.ru
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
Introducing UDB: Your One-Stop Solution for Effortless Anime, Drama, Movies, TV Shows Downloads. UDB is a powerful and user-friendly download utility specifically designed for anime, drama, tv-series ...
Repository of small data analysis and visualisation projects to try out libraries and create new types of visualisations. Mostly using Python.
AniWorld Downloader is a command-line tool for downloading and streaming anime, series and movies, compatible with Windows, macOS, and Linux. If you like this project, please consider leaving a :star:...
Data Engineering/Scraping Project. Creating a detailed Sports Relational Database for the Top European Soccer Leagues.
Mawaqi Api is a REST API for mawaqit.net. The mawaqit.net website gives you the prayer times for more than 8000 mosques around the world; the idea behind this API is to create an api web app that can ...
Official implementation of the paper "AutoScraper: A Progressive Understanding Web Agent for Web Scraper Generation" [EMNLP '24]
Undetected Web-Scraping & Seamless HTML Parsing in Python!
Web scraping tool used to record business addresses, phone numbers, website, supported area and other relevant information of companies from Yelp.com
The Google Scholar Scraper is a Python program that allows users to extract articles from Google Scholar based on the provided title or keyword.