Statistics for topic web-crawler
RepositoryStats tracks 579,129 Github repositories, of these 70 are tagged with the web-crawler topic. The most common primary language for repositories using this topic is Python (25).
Stargazers over time for topic web-crawler
Most starred repositories for topic web-crawler (view more)
Trending repositories for topic web-crawler (view more)
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works...
Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
CLI tool for saving a faithful copy of a complete web page in a single HTML file (based on SingleFile)
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works...
Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
CLI tool for saving a faithful copy of a complete web page in a single HTML file (based on SingleFile)
GroqCrawl is a powerful and user-friendly web crawling and scraping application built with Streamlit and powered by PocketGroq. It provides an intuitive interface for extracting LLM friendly AI consum...
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and o...
Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works...
GroqCrawl is a powerful and user-friendly web crawling and scraping application built with Streamlit and powered by PocketGroq. It provides an intuitive interface for extracting LLM friendly AI consum...
🔥 This repository contains complete application examples, including websites and other projects, developed using Firecrawl.
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works...
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and o...
Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works...
Undetected Web-Scraping & Seamless HTML Parsing in Python!
"Enhancing LLM Factual Accuracy with RAG to Counter Hallucinations: A Case Study on Domain-Specific Queries in Private Knowledge-Bases" by Jiarui Li and Ye Yuan and Zehua Zhang
GroqCrawl is a powerful and user-friendly web crawling and scraping application built with Streamlit and powered by PocketGroq. It provides an intuitive interface for extracting LLM friendly AI consum...