Statistics for topic web-crawler

RepositoryStats tracks 584,797 GitHub repositories; of these, 70 are tagged with the web-crawler topic. The most common primary language for repositories using this topic is Python (25).

Stargazers over time for topic web-crawler

Most starred repositories for topic web-crawler

1.4k
18.9k
agpl-3.0
102
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
Created 2024-04-15
2,248 commits to main branch, last one 14 hours ago
670
15.7k
apache-2.0
103
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and o...
Created 2016-08-26
4,692 commits to master branch, last one a day ago
1.8k
11.4k
bsd-3-clause
213
Distributed web crawler admin platform for managing spiders regardless of language or framework.
Created 2019-02-10
2,674 commits to main branch, last one about a month ago
A next-generation crawler platform that defines crawl flows graphically, so crawlers can be built without writing code.
Created 2020-03-27
482 commits to master branch, last one 2 years ago
A collection of awesome web crawlers and spiders in different languages
Created 2016-10-10
108 commits to master branch, last one about a year ago
461
5.7k
gpl-3.0
36
Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
Created 2024-06-04
122 commits to main branch, last one 17 days ago

Trending repositories for topic web-crawler