5 results found

44
900
apache-2.0
2
E2M converts various file types (doc, docx, epub, html, htm, url, pdf, ppt, pptx, mp3, m4a) into Markdown. It’s easy to install, with dedicated parsers and converters, supporting custom configs. E2M o...
Created 2024-08-04
190 commits to main branch, last one 4 months ago
Easily deployable and scalable backend server that efficiently converts various document formats (pdf, docx, pptx, html, images, etc.) into Markdown. With support for both CPU and GPU processing, it is...
Created 2024-11-05
40 commits to main branch, last one a day ago
Parse PDFs into Markdown using vision LLMs
Created 2024-12-16
104 commits to main branch, last one a day ago
Conversion of PDF documents to structured Markdown, optimized for Retrieval-Augmented Generation (RAG) and other NLP tasks. Extract text, tables, and images with preserved formatting for enhanced info...
Created 2024-09-10
26 commits to main branch, last one 2 months ago