iamarunbrahma / pdf-to-markdown

Conversion of PDF documents to structured Markdown, optimized for Retrieval Augmented Generation (RAG) and other NLP tasks. Extract text, tables, and images with preserved formatting for enhanced information retrieval and processing.

Date Created 2024-09-10 (5 months ago)
Commits 26 (last one 3 months ago)
Stargazers 53 (0 this week)
Watchers 3 (0 this week)
Forks 4
License MIT
Ranking

RepositoryStats indexes 618,350 repositories. Of these, iamarunbrahma/pdf-to-markdown is ranked #464,479 (25th percentile) for total stargazers and #434,053 for total watchers. GitHub reports the primary language for this repository as Python; among repositories using this language, it is ranked #89,805/125,168.
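
The stargazer percentile quoted above is consistent with a simple share-of-index calculation. A minimal sketch, assuming (RepositoryStats does not state this) that the percentile is the share of indexed repositories ranked below this one, rounded to the nearest whole number; the function name `rank_percentile` is hypothetical:

```python
def rank_percentile(rank: int, total: int) -> int:
    """Share of indexed repositories ranked below `rank`, as a whole percent.

    Assumption: rank 1 is the best position and ties are ignored.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return round(100 * (total - rank) / total)


# Figures taken from the stargazer ranking above.
print(rank_percentile(464_479, 618_350))  # 25
```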

iamarunbrahma/pdf-to-markdown is also tagged with popular topics. For these it is ranked: python (#18,582/22,999), rag (#453/626), information-retrieval (#182/223), retrieval-augmented-generation (#159/215)

Other Information

iamarunbrahma/pdf-to-markdown has 1 open pull request on GitHub. No pull requests have been merged over the lifetime of the repository.

Star History

GitHub stargazers over time

[Chart: stargazer count (axis 0 to 60), 15 Sep 2024 to 15 Feb 2025]

Watcher History

GitHub watchers over time; collection started in '23

[Chart: watcher count (axis about 3 to 4), 20 Dec 2024 to 20 Feb 2025]

Recent Commit History

26 commits on the default branch (main) since Jan '22

[Chart: commit count (axis 0 to 30), 15 Sep 2024 to 15 Feb 2025]

Yearly Commits

Commits to the default branch (main) per year

[Chart: commits per year (axis 0 to 30); the only year shown is 2024]

Issue History

Total Issues
Open Issues
Closed Issues
[Chart: issue counts (axis 0 to 1), Dec 2024 to 15 Feb 2025]

Languages

The only known language in this repository is Python


updated: 2025-02-16 @ 09:31pm, id: 855288462 / R_kgDOMvqqjg