11 results found
- Filter by Primary Language:
- Java (4)
- Python (3)
- JavaScript (2)
- Elixir (1)
- TypeScript (1)
SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
Created
2017-08-05
4,039 commits to dev branch, last one 2 days ago
ingestr is a CLI tool to copy data between any databases with a single command seamlessly.
Created
2024-02-12
138 commits to main branch, last one 3 days ago
Concurrent and multi-stage data ingestion and data processing with Elixir
Created
2018-11-05
395 commits to main branch, last one 8 days ago
Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
Created
2022-01-12
2,721 commits to master branch, last one 9 hours ago
Pravega - Streaming as a new software defined storage primitive
Created
2016-07-11
3,294 commits to master branch, last one 3 months ago
Orbital automates integration between data sources (APIs, Databases, Queues and Functions). BFFs, API Composition and ETL pipelines that adapt as your specs change.
Created
2022-09-26
4,914 commits to develop branch, last one 7 months ago
Use SQL to build ELT pipelines on a data lakehouse.
Created
2021-03-11
481 commits to main branch, last one 2 years ago
A Python library that enables ML teams to share, load, and transform data in a collaborative, flexible, and efficient way
Created
2022-02-11
128 commits to main branch, last one about a month ago
The Data Engineering Book - a data engineering book by Thai people, for Thai people
Created
2021-01-07
226 commits to main branch, last one 8 months ago
Apache Spark examples exclusively in Java
Created
2016-06-26
215 commits to master branch, last one 2 years ago
Squirrel dataset hub
Created
2022-02-01
118 commits to main branch, last one about a year ago