10 results found

292
3.4k
apache-2.0
149
Fast, efficient, and scalable distributed map/reduce system with DAG execution, in memory or on disk, written in pure Go; runs standalone or distributed.
Created 2016-08-26
773 commits to master branch, last one about a month ago
428
2.0k
bsd-3-clause
155
A search engine which can hold 100 trillion lines of log data.
Created 2016-10-20
40 commits to master branch, last one 7 years ago
98
940
apache-2.0
17
Kubernetes-native platform to run massively parallel data/streaming jobs
Created 2022-05-20
914 commits to main branch, last one 15 hours ago
Efficient transducers for Julia
Created 2018-12-23
1,020 commits to master branch, last one about a year ago
Fundamentals of Spark with Python (using PySpark), code examples
Created 2018-08-20
73 commits to master branch, last one 3 years ago
10
314
mit
6
Parallelized Base functions
Created 2020-02-20
188 commits to master branch, last one about a year ago
Data science and Big Data with Python
Created 2016-07-14
31 commits to master branch, last one 9 months ago
15
118
apache-2.0
5
Fast & furious GroupBy operations for dask.array
Created 2021-03-24
587 commits to main branch, last one 2 days ago
Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby
Created 2019-07-20
174 commits to master branch, last one 2 years ago
Data-parallelism on CUDA using Transducers.jl and for loops (FLoops.jl)
Created 2020-10-11
104 commits to master branch, last one 2 years ago