Statistics for topic web-scraping

RepositoryStats tracks 633,100 GitHub repositories, of which 255 are tagged with the web-scraping topic. The most common primary language for repositories using this topic is Python (118). Other languages include JavaScript (25), Jupyter Notebook (17), TypeScript (15), Go (12), and HTML (12).

Stargazers over time for topic web-scraping

Most starred repositories for topic web-scraping

scrapy/scrapy

10.7k

54.7k

bsd-3-clause

1.8k

Scrapy, a fast high-level web crawling & scraping framework for Python.

python crawler crawling scraping framework web-scraping hacktoberfest web-scraping-python

Created 2010-02-22

10,758 commits to master branch, last one 6 days ago

dgtlmoon/changedetection.io

1.2k

22.9k

apache-2.0

101

The best and simplest free open source web page change detection, website watcher, restock monitor and notification service. Restock Monitor, change detection. Designed for simplicity - Simply monito...

Created 2021-01-27

1,714 commits to master branch, last one 2 days ago

apify/crawlee

781

17.3k

apache-2.0

108

Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and o...

npm apify nodejs crawler scraper crawling headless scraping puppeteer automation javascript playwright typescript web-crawler web-crawling web-scraping headless-chrome

Created 2016-08-26

4,865 commits to master branch, last one 22 minutes ago