vishwajeetdabholkar / eGet-Crawler-for-ai

Web scraping framework built for AI applications. Extract clean, structured content from any website with dynamic content handling, markdown conversion, and intelligent crawling capabilities. Perfect for RAG applications and AI training data pipelines. Features async processing, browser management, and Prometheus monitoring.

Date Created 2024-11-15 (4 months ago)
Commits 40 (last one 24 days ago)
Stargazers 42 (0 this week)
Watchers 2 (0 this week)
Forks 15
License apache-2.0
Ranking

RepositoryStats indexes 630,031 repositories. Of these, vishwajeetdabholkar/eGet-Crawler-for-ai is ranked #533,152 (15th percentile) for total stargazers and #480,376 for total watchers. GitHub reports the primary language for this repository as Python; among repositories using this language, it is ranked #105,351/128,534.

vishwajeetdabholkar/eGet-Crawler-for-ai is also tagged with popular topics. For these it is ranked: markdown (#1,851/2,058), pdf (#985/1,072), rag (#555/679), knowledge-base (#141/163)
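The page does not say how a rank is turned into a percentile. As a rough sketch, assuming the percentile is simply the share of indexed repositories ranked below this one, rounded down, the stargazer figure above can be reproduced as follows. The function name `percentile_from_rank` is hypothetical and is not part of RepositoryStats.

```python
import math


def percentile_from_rank(rank: int, total: int) -> int:
    """Percentage of indexed repositories ranked below `rank`, rounded down.

    Assumption: rank 1 is the best position and `total` is the number of
    repositories in the index. This is an illustrative formula, not the
    site's documented method.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return math.floor((total - rank) / total * 100)


# Stargazer ranking quoted above: #533,152 out of 630,031 indexed repositories.
print(percentile_from_rank(533_152, 630_031))
```

Under this assumption the stargazer rank works out to the 15th percentile, which agrees with the figure shown on the page.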

Star History

GitHub stargazers over time

[Chart: stargazers over time; y-axis 0 to 45, x-axis Dec '24 to 15 Mar '25]

Watcher History

GitHub watchers over time, collection started in '23

[Chart: watchers over time; y-axis 1 to 2, x-axis 15 Dec '24 to 15 Mar '25]

Recent Commit History

40 commits on the default branch (main) since Jan '22

[Chart: cumulative commits; y-axis 0 to 40, x-axis 15 Nov '24 to 15 Mar '25]

Yearly Commits

Commits to the default branch (main) per year

[Chart: commits per year; y-axis 0 to 12, x-axis 2024]

Issue History

No issues have been posted

Languages

The primary language is Python, but the repository also contains other languages:

Python, HTML, Dockerfile

updated: 2025-03-01 @ 09:17am, id: 888802241 / R_kgDONPoLwQ