Statistics for topic scraping

RepositoryStats tracks 641,702 GitHub repositories, of which 386 are tagged with the scraping topic. The most common primary language for repositories using this topic is Python (202). Other languages include TypeScript (41), JavaScript (35), Go (21), HTML (13), and PHP (13).

Stargazers over time for topic scraping

Most starred repositories for topic scraping

scrapy scrapy

10.8k

55.0k

bsd-3-clause

1.8k

Scrapy, a fast high-level web crawling & scraping framework for Python.

python crawler crawling scraping framework web-scraping hacktoberfest web-scraping-python

Created 2010-02-22

10,758 commits to master branch, last one 28 days ago

firecrawl mendableai

3.2k

36.4k

agpl-3.0

193

🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.

ai llm rag data crawler scraper markdown scraping ai-scraping web-crawler webscraping html-to-markdown

Created 2024-04-15

3,330 commits to main branch, last one 2 days ago

Jobs_Applier_AI_Agent_AIHawk feder-cr

4.2k

28.0k

agpl-3.0

190

AIHawk aims to ease the job hunt by automating the job application process. Utilizing artificial intelligence, it enables users to apply for multiple jobs in a tailored way.

Created 2024-08-04

37 commits to main branch, last one about a month ago

colly gocolly

1.8k

24.1k

apache-2.0

327

Elegant Scraper and Crawler Framework for Golang

go golang spider crawler scraper crawling scraping framework

Created 2017-09-29

685 commits to master branch, last one 24 days ago

Scrapegraph-ai ScrapeGraphAI

1.6k

19.3k

mit

129

Python scraper based on AI

ai sc llm gpt-3 gpt-4 llama3 scraping scrapingweb webscraping scraping-python machine-learning automated-scraper

Created 2024-01-27

2,689 commits to main branch, last one 6 days ago

crawlee apify

801

17.5k

apache-2.0

108

Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and o...

npm apify nodejs crawler scraper crawling headless scraping puppeteer automation javascript playwright typescript web-crawler web-crawling web-scraping headless-chrome

Created 2016-08-26

4,894 commits to master branch, last one 3 days ago
