Trending repositories for topic scraper
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
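A visual no-code web crawler/spider. 易采集 (EasySpider): a visual browser automation testing, data collection, and crawler tool that lets you design and run crawling tasks graphically without writing code. Also known as ServiceWrapper, an intelligent service wrapping system for web applications.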
Create agents that monitor and act on your behalf. Your agents are standing by!
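🚀 Douyin_TikTok_Download_API is an out-of-the-box, high-performance asynchronous data scraping tool for Douyin, Kuaishou, TikTok, and Bilibili. It supports API calls and online batch parsing and downloading.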
Crawly, a high-level web crawling & scraping framework for Elixir.
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and o...
The fast, flexible, and elegant library for parsing and manipulating HTML and XML.
📖 The most advanced (yet simple) CLI manga downloader in the entire universe! Lua scrapers, export formats, AniList integration, fancy TUI and more!
💡 Download the complete source code of any website, including all assets (JavaScript, stylesheets, images), using Node.js
Twitter API Scraper | Without an API key | Twitter Internal API | Free | Twitter scraper | Twitter Bot
There was no free Dictionary API on the web when I wanted one for my friend, so I created one.
AV movie management system; avmoo, javbus, and javlibrary crawler; online AV film library; AV magnet link database. Japanese Adult Video Library, Adult Video Magnet Links - Japanese Adult Video Database
A highly efficient, fast, powerful and light-weight anime downloader and streamer for your favorite anime.
Analysis of bot protection systems with available countermeasures 🚿. How to defeat anti-bot systems 👻 and get around browser fingerprinting scripts 🕵️‍♂️ when scraping the web?
Script to scrape free and premium Substack posts, saving them as Markdown files. Also generates HTML interfaces to allow you to browse and sort the markdown files for each author.
This script allows you to automate the creation of Gmail accounts using the Selenium automation framework with the Chrome WebDriver. It navigates through the Gmail sign-up process by filling in the re...
Open Source Node.js script that simplifies scraping media files and messages from Telegram channels, groups, or users, facilitating offline access and storage of images, videos, and documents
This is a Twitter Scraper which uses Selenium for scraping tweets. It is capable of scraping tweets from home, user profile, hashtag, query or search, and advanced searches.
A console application that scrapes valid watch links for any movie or series by exact season and episode number. You can also download a whole season with one click.
Home Assistant custom component for scraping (html, xml or json) multiple values (from a single HTTP request) with a separate sensor/attribute for each value. Support for (login) form-submit functiona...
Web scraper with a simple REST API living in Docker and using a Headless browser and Readability.js for parsing.
Node.js library to receive live stream events (comments, gifts, etc.) in realtime from TikTok LIVE.
Python library to receive live stream events (comments, gifts, etc.) in realtime from TikTok LIVE.
Geziyor, blazing fast web crawling & scraping framework for Go. Supports JS rendering.
Java implementation of TikTok-Live-Connector library. Receive live stream events (comments, gifts, etc.) in realtime from TikTok LIVE.
A Python script that empowers people with no programming background to generate robust leads at scale. This repo will compile various versatile techniques used in lead generation.
Introducing the AmazonMe web scraper - a powerful tool for extracting data from Amazon.com using the Requests and BeautifulSoup libraries in Python. This scraper allows users to easily navigate and extra...
Implementation of Twitter internal API (Twitter graphql API) in TypeScript
📰 Newspaper4k, a fork of the beloved Newspaper3k. Extraction of articles, titles, and metadata from news websites.
newspaper3k is a news, full-text, and article metadata extraction library for Python 3.
A collection of awesome web crawlers and spiders in different languages
Completely free and open-source human-like Instagram bot. Powered by UIAutomator2 and compatible with basically any Android device 5.0+ that can run Instagram - real or emulated.
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
An *arr inspired approach to downloading manga using individual sources
RowsX is a Chrome extension that performs simple web scraping tasks for business users. It loads data from website tables into spreadsheets. Developed by Rows.com.
Kemono/Coomer self-updating downloader. Can download from a list of users, your website favorites, or URLs and usernames you specify.
Python-based web crawling script with randomized intervals, user-agent rotation, and proxy server IP rotation to outsmart website bots and prevent blocking.
Node.js API for obtaining anime information from hianime.to (formerly aniwatch.to) written in TypeScript, made with Cheerio & Axios
Extract information from all games published in Steam thanks to its Web API, and store it in JSON format.
A working vidsrc.to/vidsrc.me extractor as an API. Proof of concept and educational.
🤖Free ChatGPT Line Bot with Horoscope, Music Broadcast, Google Image Search...
🕵️ Unleash Metadata Intelligence with MetaDetective. Your Assistant Beyond Metagoofil.
A simple web scraping plugin for Synology Video Station
YouTube Scraper for effortless public YouTube data collection, including video and channel information.
A lightweight tool for scraping current and historic Google Analytics data
Netflix Scraper for easy collection of titles, descriptions, cast, ratings, and other public data from Netflix.
Unofficial Claude API supporting direct HTTP chat creation/deletion/retrieval, messages with multiple file attachments and auto session gathering using Firefox with geckodriver.
👾 Fast and simple video download library and CLI tool written in Go
Easy to use fansly.com content downloading tool. Written in python, but ships as a standalone Executable App for Windows too. Enjoy your Fansly content offline anytime, anywhere in the highest possibl...
2024! Twitter API scraper with authorization support. Allows you to scrape search results, user profiles (followers/following), tweets (favoriters/retweeters) and more.
Scrape content from OnlyFans #onlyfans -- #of-scr -- #onlyfans scrape -- #onlyfans-dl -- OnlyFans content downloader -- #of scrap -- #onlysnap
Python MLS and Real-Estate Data Scraper for the Realtor.ca Website
Scrape ETF data for free - a python wrapper for scraping ETFDB
Node.js library that provides an API for obtaining movie information from the FlixHQ website.
OpenAPI(Swagger) specification of Twitter Internal API (Twitter graphql API)
ScrapeGPT is a RAG-based Telegram bot designed to scrape and analyze websites, then answer questions based on the scraped content. The bot utilizes Retrieval Augmented Generation and web scraping to re...