Statistics for topic scraping

RepositoryStats tracks 641,702 GitHub repositories, of which 386 are tagged with the scraping topic. The most common primary language for repositories using this topic is Python (202). Other languages include TypeScript (41), JavaScript (35), Go (21), HTML (13), and PHP (13).

Stargazers over time for topic scraping

[Chart: stargazers over time, 2020 to 2025; vertical axis from 0 to 250]

Most starred repositories for topic scraping

10.8k
55.0k
bsd-3-clause
1.8k
Scrapy, a fast high-level web crawling & scraping framework for Python.
Created 2010-02-22
10,758 commits to master branch, last one 28 days ago
3.2k
36.4k
agpl-3.0
193
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
Created 2024-04-15
3,330 commits to main branch, last one 2 days ago
AIHawk aims to ease the job hunt by automating the job application process. Using artificial intelligence, it enables users to apply for multiple jobs in a tailored way.
Created 2024-08-04
37 commits to main branch, last one about a month ago
1.8k
24.1k
apache-2.0
327
Elegant Scraper and Crawler Framework for Golang
Created 2017-09-29
685 commits to master branch, last one 24 days ago
Python scraper based on AI
Created 2024-01-27
2,689 commits to main branch, last one 6 days ago
801
17.5k
apache-2.0
108
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and o...
Created 2016-08-26
4,894 commits to master branch, last one 3 days ago

Trending repositories for topic scraping