Statistics for topic web-crawler

RepositoryStats tracks 518,989 GitHub repositories, of which 57 are tagged with the web-crawler topic. The most common primary language for repositories using this topic is Python (17 repositories).

Stargazers over time for topic web-crawler

Most starred repositories for topic web-crawler

530
12.3k
apache-2.0
94
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and o...
Created 2016-08-26
4,353 commits to master branch, last one 15 hours ago
1.7k
10.9k
bsd-3-clause
211
Distributed web crawler admin platform for managing spiders regardless of language or framework.
Created 2019-02-10
2,558 commits to main branch, last one 6 months ago
A next-generation crawler platform that defines crawl workflows graphically, so crawlers can be built without writing code.
Created 2020-03-27
482 commits to master branch, last one about a year ago
A collection of awesome web crawlers and spiders in different languages
Created 2016-10-10
108 commits to master branch, last one 6 months ago
241
2.9k
agpl-3.0
20
🔥 Turn entire websites into LLM-ready markdown
Created 2024-04-15
390 commits to main branch, last one a day ago
1.2k
2.8k
apache-2.0
237
Apache Nutch is an extensible and scalable web crawler
Created 2009-05-21
3,439 commits to master branch, last one 2 days ago

Trending repositories for topic web-crawler