13 results found

1.2k
23.5k
mit
176
🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...
Created 2017-05-05
4,643 commits to dev branch, last one a day ago
34
514
mit
10
Wayback Machine API interface & a command-line tool
Created 2020-05-02
497 commits to master branch, last one 2 years ago
🌐 Guide and tools to run a full offline mirror of Wikipedia.org with three different approaches: Nginx caching proxy, Kiwix + ZIM dump, and MediaWiki/XOWA + XML dump
Created 2019-09-08
37 commits to master branch, last one 3 years ago
😇 A Docker Compose bundle to run on servers with spare CPU, RAM, disk, and bandwidth to help the world. Includes Tor, ArchiveWarrior, BOINC, and more...
Created 2021-04-12
64 commits to main branch, last one 10 months ago
Official ArchiveBox browser extension: automatically/manually preserve your browsing history using ArchiveBox.
Created 2021-06-30
154 commits to master branch, last one 3 days ago
Desktop Electron app for ArchiveBox internet archiver. (ALPHA: not ready for general use)
Created 2020-11-23
58 commits to main branch, last one 2 years ago
5
155
agpl-3.0
5
Navigator for Web Archive
Created 2019-05-27
253 commits to master branch, last one 3 years ago
Scrape posts and threads from forums, news aggregators, and mail archives; export to JSONL, mailbox, or WARC
Created 2023-02-05
420 commits to develop branch, last one about a year ago
Passively capture, archive, and hoard your web browsing history, including the contents of the pages you visit, for later offline viewing, replay, mirroring, data scraping, and/or indexing. Your own p...
Created 2023-08-20
1,245 commits to master branch, last one about a month ago
⬇️ A simple all-in-one CLI tool to download EVERYTHING from a URL (like youtube-dl/yt-dlp, forum-dl, gallery-dl, simpler ArchiveBox). 🎭 Uses headless Chrome to get HTML, JS, CSS, images/video/audio/s...
Created 2024-10-21
92 commits to main branch, last one 2 months ago
🎭 An introduction to the Internet Archiving ecosystem, tooling, and some of the ethical dilemmas that the community faces.
Created 2020-02-08
17 commits to master branch, last one 7 months ago
Home of the official docker image for ArchiveBox
Created 2020-11-26
34 commits to main branch, last one 3 months ago
JavaScript/Node wrapper around Mozilla's Readability library so that ArchiveBox can call it as a one-shot CLI command to extract each page's article text.
Created 2020-08-06
44 commits to main branch, last one 11 months ago