Statistics for topic web-crawler

RepositoryStats tracks 595,856 GitHub repositories, of which 70 are tagged with the web-crawler topic. The most common primary language for repositories using this topic is Python (25).

Stargazers over time for topic web-crawler

Most starred repositories for topic web-crawler

1.6k
20.4k
agpl-3.0
111
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
Created 2024-04-15
2,473 commits to main branch, last one 20 hours ago
708
16.2k
apache-2.0
105
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and o...
Created 2016-08-26
4,744 commits to master branch, last one 17 hours ago
1.8k
11.4k
bsd-3-clause
214
Distributed web crawler admin platform for spider management, regardless of language or framework.
Created 2019-02-10
2,674 commits to main branch, last one 2 months ago
A next-generation crawler platform that defines crawler workflows graphically, so crawlers can be built without writing code.
Created 2020-03-27
482 commits to master branch, last one 2 years ago
A collection of awesome web crawlers and spiders in different languages
Created 2016-10-10
108 commits to master branch, last one about a year ago
473
5.9k
gpl-3.0
36
Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
Created 2024-06-04
122 commits to main branch, last one about a month ago

Trending repositories for topic web-crawler