iamarunbrahma / pdf-to-markdown

Conversion of PDF documents to structured Markdown, optimized for Retrieval Augmented Generation (RAG) and other NLP tasks. Extract text, tables, and images with preserved formatting for enhanced information retrieval and processing.

Date Created 2024-09-10 (3 months ago)
Commits 26 (last one 26 days ago)
Stargazers 30 (3 this week)
Watchers 3 (0 this week)
Forks 3
License MIT
Ranking

RepositoryStats indexes 594,458 repositories. Of these, iamarunbrahma/pdf-to-markdown is ranked #573,096 (4th percentile) for total stargazers and #426,934 for total watchers. GitHub reports the primary language for this repository as Python; among repositories using this language, it is ranked #113,943/118,961.

iamarunbrahma/pdf-to-markdown is also tagged with popular topics. Within each topic it is ranked: python (#21,778/22,283), rag (#481/522), information-retrieval (#215/217), retrieval-augmented-generation (#171/187)
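
The percentile quoted for the stargazer ranking can be roughly reproduced from the rank and the index size. The sketch below assumes that "percentile" means the share of indexed repositories ranked below this one; RepositoryStats does not state its formula on this page, so the helper `percentile_from_rank` and its definition are hypothetical.

```python
def percentile_from_rank(rank: int, total: int) -> float:
    """Share (in percent) of repositories ranked below `rank`.

    Assumed definition, not confirmed by RepositoryStats.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return 100.0 * (total - rank) / total


# Figures taken from the ranking paragraph above.
stargazer_pct = percentile_from_rank(573_096, 594_458)
print(round(stargazer_pct))  # 4
```

Under this assumption the stargazer rank works out to about 3.6%, which rounds to the 4th percentile reported on the page.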

Star History

GitHub stargazers over time

Watcher History

GitHub watchers over time, collection started in '23

Recent Commit History

26 commits on the default branch (main) since Jan '22

Yearly Commits

Commits to the default branch (main) per year

Issue History

Languages

The only known language in this repository is Python

updated: 2024-12-17 @ 09:26pm, id: 855288462 / R_kgDOMvqqjg