9 results found

229
3.0k
apache-2.0
29
Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
Created 2019-04-08
1,507 commits to master branch, last one a day ago
Extract the main article from a given URL with Node.js
Created 2015-11-29
731 commits to main branch, last one 24 days ago
120
456
apache-2.0
21
Readability / HTML Content / Article Extractor & Web Scraping library written in PHP
This repository has been archived
Created 2014-09-24
238 commits to master branch, last one 8 months ago
34
149
apache-2.0
10
SmartReader is a library to extract the main content of a web page, based on a port of the Readability library by Mozilla
Created 2017-09-26
369 commits to master branch, last one about a month ago
An article extractor in Rust
Created 2020-04-30
141 commits to master branch, last one 2 years ago
Parse a Markdown article, download its images, and replace image URLs with local paths
Created 2019-10-05
154 commits to master branch, last one 9 days ago
17
102
mit
11
Reddit bot to preview and post hyperlinks as comments
Created 2018-12-30
151 commits to master branch, last one 4 years ago
23
92
mit
12
NLP Web Service
Created 2016-12-20
83 commits to master branch, last one 2 years ago
Extract an article or news item from a URL or HTML, parse the title and content, and output it in Markdown format.
Created 2020-09-23
114 commits to master branch, last one 18 days ago