paulpierre / markdown-crawler

A multithreaded 🕸️ web crawler that recursively crawls a website and creates a 🔽 markdown file for each page, designed for LLM RAG

Date Created 2023-10-24 (about a year ago)
Commits 14 (last one about a year ago)
Stargazers 374 (4 this week)
Watchers 5 (0 this week)
Forks 43
License MIT

Ranking

RepositoryStats indexes 636,050 repositories. Of these, paulpierre/markdown-crawler is ranked #118,740 (81st percentile) for total stargazers and #325,225 for total watchers. GitHub reports the primary language for this repository as Python; among repositories using this language, it is ranked #20,541/130,195.
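
The percentile figure can be reproduced from the rank and the index size. The sketch below assumes the percentile is the share of indexed repositories ranked below this one, rounded down; the site does not state its exact formula, and the function name `percentile_from_rank` is hypothetical.

```python
import math


def percentile_from_rank(rank: int, total: int) -> int:
    """Percent of indexed repositories ranked below `rank`, rounded down.

    Assumed formula; the site's own rounding rule is not documented here.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return math.floor((1 - rank / total) * 100)


# Stargazer rank #118,740 out of 636,050 indexed repositories.
print(percentile_from_rank(118_740, 636_050))  # 81
```

Under the same assumption, the Python-only rank of #20,541/130,195 would correspond to the 84th percentile.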

paulpierre/markdown-crawler is also tagged with popular topics. For these, it is ranked: llm (#1,033/3,629), markdown (#627/2,070), rag (#240/709), llmops (#89/182)

Other Information

paulpierre/markdown-crawler has 4 open pull requests on GitHub. No pull requests have been merged over the lifetime of the repository.

GitHub issues are enabled. There are 10 open issues and 1 closed issue.

There have been 2 releases. The latest was published on 2023-10-24 (about a year ago) with the name markdown-crawl-0.0.4.

Homepage URL: https://pypi.org/project/markdown-crawler/

Star History

GitHub stargazers over time

[Chart: stargazer count, y-axis 0 to 400, monthly from Nov '23 to Apr '25]

Watcher History

GitHub watchers over time, collection started in '23

[Chart: watcher count, y-axis 2 to 5, monthly from Dec '23 to Apr '25]

Recent Commit History

14 commits on the default branch (main) since Jan '22

[Chart: cumulative commits, y-axis 0 to 14, monthly from Nov '23 to Apr '25]

Yearly Commits

Commits to the default branch (main) per year

[Chart: commits per year, y-axis 0 to 2, x-axis 2024]

Issue History

[Chart: total, open, and closed issues, y-axis 0 to 12, monthly from Dec '23 to Apr '25]

Languages

The only known language in this repository is Python

updated: 2025-04-06 @ 05:52am, id: 709317996 / R_kgDOKkdVbA