helgeho / ArchiveSpark

An Apache Spark framework for easy data processing, extraction as well as derivation for web archives and archival collections, developed at Internet Archive.

Date Created 2015-08-06 (9 years ago)
Commits 144 (last one 10 days ago)
Stargazers 143 (0 this week)
Watchers 14 (0 this week)
Forks 19
License MIT
Ranking

RepositoryStats indexes 565,600 repositories; of these, helgeho/ArchiveSpark is ranked #221,659 (61st percentile) for total stargazers and #153,470 for total watchers. GitHub reports the primary language for this repository as Scala; among repositories using this language, it is ranked #869/1,965.

helgeho/ArchiveSpark is also tagged with popular topics; for these it is ranked: spark (#277/522)

Other Information

helgeho/ArchiveSpark has 1 open pull request on GitHub; 1 pull request has been merged over the lifetime of the repository.

GitHub issues are enabled; there are 3 open issues and 21 closed issues.

There have been 6 releases; the latest one was published on 2024-06-05 (3 months ago) with the name latest-SNAPSHOT.

Star History

GitHub stargazers over time

Watcher History

GitHub watchers over time (collection started in '23)

Recent Commit History

5 commits on the default branch (master) since Jan '22

Yearly Commits

Commits to the default branch (master) per year

Issue History

Languages

The primary language is Scala, but there are others as well...

updated: 2024-09-19 @ 10:21pm, id: 40323593 / R_kgDOAmdKCQ