11 results found

1.7k
7.6k
apache-2.0
171
SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
Created 2017-08-05
4,039 commits to dev branch, last one 2 days ago
50
2.4k
mit
15
ingestr is a CLI tool to copy data between any databases with a single command seamlessly.
Created 2024-02-12
138 commits to main branch, last one 3 days ago
155
2.3k
apache-2.0
48
Concurrent and multi-stage data ingestion and data processing with Elixir
Created 2018-11-05
395 commits to main branch, last one 8 days ago
834
2.1k
apache-2.0
73
Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
Created 2022-01-12
2,721 commits to master branch, last one 9 hours ago
404
2.0k
apache-2.0
107
Pravega - Streaming as a new software defined storage primitive
Created 2016-07-11
3,294 commits to master branch, last one 3 months ago
10
287
other
4
Orbital automates integration between data sources (APIs, Databases, Queues and Functions). BFFs, API Composition and ETL pipelines that adapt as your specs change.
Created 2022-09-26
4,914 commits to develop branch, last one 7 months ago
28
283
apache-2.0
12
Use SQL to build ELT pipelines on a data lakehouse.
Created 2021-03-11
481 commits to main branch, last one 2 years ago
A Python library that enables ML teams to share, load, and transform data in a collaborative, flexible, and efficient way 🌰
Created 2022-02-11
128 commits to main branch, last one about a month ago
The Data Engineering Book - a data engineering book by Thai people, for Thai people
Created 2021-01-07
226 commits to main branch, last one 8 months ago
Apache Spark examples exclusively in Java
Created 2016-06-26
215 commits to master branch, last one 2 years ago