10 results found
- Filter by Primary Language:
- Python (5)
- C# (1)
- JavaScript (1)
- PHP (1)
- Rust (1)
- TypeScript (1)
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
Created
2019-04-08
1,581 commits to master branch, last one 7 hours ago
Extracts the main article from a given URL with Node.js
Created
2015-11-29
755 commits to main branch, last one 24 days ago
Readability / HTML Content / Article Extractor & Web Scraping library written in PHP
This repository has been archived
Created
2014-09-24
238 commits to master branch, last one about a year ago
SmartReader is a library to extract the main content of a web page, based on a port of the Readability library by Mozilla
Created
2017-09-26
407 commits to master branch, last one about a month ago
An article extractor in Rust
Created
2020-04-30
141 commits to master branch, last one 3 years ago
Parse a Markdown article, download its images, and replace the image URLs with local paths
Created
2019-10-05
154 commits to master branch, last one 6 months ago
Reddit bot to preview and post hyperlinks as comments
Created
2018-12-30
151 commits to master branch, last one 5 years ago
NLP Web Service
Created
2016-12-20
83 commits to master branch, last one 2 years ago
Extract an article or news item from a URL or HTML, parse the title and content, and output it in Markdown format.
Created
2020-09-23
117 commits to master branch, last one 5 months ago
The best HTML-to-Markdown library: ESM-native, with useful utilities; simple, lightweight, and epic quality.
Created
2023-11-03
324 commits to main branch, last one about a month ago