10 results found

289
3.5k
apache-2.0
147
Fast, efficient, and scalable distributed map/reduce system with DAG execution, in memory or on disk, written in pure Go; runs standalone or distributed.
Created 2016-08-26
780 commits to master branch, last one 10 days ago
428
2.0k
bsd-3-clause
155
A search engine which can hold 100 trillion lines of log data.
Created 2016-10-20
40 commits to master branch, last one 7 years ago
120
1.8k
apache-2.0
22
Kubernetes-native platform to run massively parallel data/streaming jobs
Created 2022-05-20
1,177 commits to main branch, last one 2 days ago
Efficient transducers for Julia
Created 2018-12-23
1,020 commits to master branch, last one about a year ago
Fundamentals of Spark with Python (using PySpark), code examples
Created 2018-08-20
73 commits to master branch, last one 4 years ago
10
322
mit
6
Parallelized Base functions
Created 2020-02-20
188 commits to master branch, last one 2 years ago
Data science and Big Data with Python
Created 2016-07-14
31 commits to master branch, last one about a year ago
18
125
apache-2.0
5
Fast & furious GroupBy operations for dask.array
Created 2021-03-24
622 commits to main branch, last one about a month ago
Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby
Created 2019-07-20
174 commits to master branch, last one 3 years ago
Data-parallelism on CUDA using Transducers.jl and for loops (FLoops.jl)
Created 2020-10-11
104 commits to master branch, last one 2 years ago