9 results found

An Awesome List for getting started with web archiving
Created 2017-06-16
143 commits to main branch, last one about a month ago
33
489
mit
11
Wayback Machine API interface & a command-line tool
Created 2020-05-02
497 commits to master branch, last one 2 years ago
WARC + AI - Experimental Retrieval Augmented Generation Pipeline for Web Archive Collections.
Created 2023-10-23
211 commits to main branch, last one about a month ago
15
107
gpl-3.0
20
Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.
Created 2019-09-07
4,429 commits to v1.21.3-at branch, last one about a month ago
Parse and create Web ARChive (WARC) files with Node.js
Created 2017-05-21
114 commits to master branch, last one 5 years ago
A list of things related to software, literature, and other content for 🕣 Memento
Created 2016-09-16
64 commits to main branch, last one 6 months ago
9
57
gpl-3.0
6
A dockerized, queued, high-fidelity web archiver based on Squidwarc
Created 2018-07-21
34 commits to master branch, last one 5 months ago
9
48
apache-2.0
18
Various Jupyter notebooks about Common Crawl data
Created 2019-07-19
23 commits to main branch, last one 2 years ago
Quick Cache and Archive search buttons
Created 2021-07-10
77 commits to main branch, last one 7 months ago