Statistics for topic web-crawler

RepositoryStats tracks 632,784 GitHub repositories, of which 74 are tagged with the web-crawler topic. The most common primary language for repositories using this topic is Python (28).

Stargazers over time for topic web-crawler

Most starred repositories for topic web-crawler

firecrawl mendableai

2.9k

33.4k

agpl-3.0

176

🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.

ai llm rag data crawler scraper markdown scraping ai-scraping web-crawler webscraping html-to-markdown

Created 2024-04-15

3,110 commits to main branch, last one a day ago

crawlee apify

780

17.3k

apache-2.0

108

Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and o...

npm apify nodejs crawler scraper crawling headless scraping puppeteer automation javascript playwright typescript web-crawler web-crawling web-scraping headless-chrome

Created 2016-08-26

4,864 commits to master branch, last one 24 hours ago