18 results found
- Filter by Primary Language:
- Python (9)
- Go (2)
- Elixir (1)
- Java (1)
- Jupyter Notebook (1)
- R (1)
- C# (1)
- TypeScript (1)
- C++ (1)
A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
Created
2024-02-29
1,654 commits to master branch, last one 19 hours ago
Web Crawler/Spider for NodeJS + server-side jQuery ;-)
Created
2010-11-25
590 commits to master branch, last one 3 months ago
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
Created
2012-10-06
2,628 commits to main branch, last one a day ago
Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
Created
2021-06-21
11,613 commits to main branch, last one a day ago
Extracts data points from images of graphs
Created
2014-11-10
1,880 commits to master branch, last one 2 years ago
Crawly, a high-level web crawling & scraping framework for Elixir.
Created
2019-03-09
320 commits to master branch, last one about a month ago
Extract structured data from websites. Website scraping.
Created
2017-02-09
885 commits to master branch, last one 4 years ago
A simple resume parser used for extracting information from resumes
Created
2018-12-11
52 commits to master branch, last one 3 years ago
Undetected Web-Scraping & Seamless HTML Parsing in Python!
Created
2024-08-04
46 commits to main branch, last one 14 days ago
An R package for acquisition and processing of NASA SMAP data
Created
2016-05-11
304 commits to master branch, last one about a year ago
Library and cli for extracting data from HTML via CSS selectors
Created
2016-01-10
214 commits to master branch, last one about a month ago
FBLYZE is a Facebook scraping system and analysis system.
Created
2016-12-21
233 commits to master branch, last one 5 years ago
Get lyrics for any song in less than 2 seconds by just passing in the song name (spelled correctly or misspelled) using this awesome Python library.
Created
2019-01-14
20 commits to master branch, last one 4 years ago
Extracting and parsing structured data with jQuery Selector, XPath or JsonPath from common web formats like HTML, XML and JSON.
Created
2015-12-25
198 commits to master branch, last one 2 years ago
This program extracts insider trading data from the SEC website and stores it in an Excel file for the specified time frame.
Created
2021-01-08
11 commits to master branch, last one 2 years ago
Unofficial Python client for Twitter
This repository has been archived
Created
2019-10-14
39 commits to master branch, last one 3 years ago
Extract structured data from any unstructured web page
Created
2024-01-28
22 commits to main branch, last one 7 months ago
A tool to replace data in a Unity Asset Bundle from modified files.
Created
2021-05-24
96 commits to main branch, last one 2 years ago