43 results found

13.8k
35.2k
apache-2.0
760
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Created 2015-04-13
24,892 commits to main branch, last one 17 hours ago
3.8k
14.7k
other
180
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
Created 2020-07-27
17,182 commits to master branch, last one 18 hours ago
3.1k
11.8k
apache-2.0
282
Apache Doris is an easy-to-use, high performance and unified analytics database.
Created 2017-08-10
20,184 commits to master branch, last one 5 hours ago
1.5k
9.2k
apache-2.0
137
dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
Created 2016-03-10
6,757 commits to main branch, last one 19 hours ago
1.7k
7.6k
apache-2.0
172
SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
Created 2017-08-05
4,035 commits to dev branch, last one 16 hours ago
671
7.4k
apache-2.0
61
🧙 Build, run, and manage data pipelines for integrating and transforming data.
Created 2022-05-16
5,348 commits to master branch, last one 23 hours ago
407
7.0k
apache-2.0
61
Infinitely scalable, event-driven, language-agnostic orchestration and scheduling platform to manage millions of workflows declaratively in code.
Created 2019-08-24
3,371 commits to develop branch, last one 5 hours ago
500
5.7k
mpl-2.0
59
The open source high performance ELT framework powered by Apache Arrow
Created 2020-11-18
17,966 commits to main branch, last one 7 hours ago
1.8k
5.4k
apache-2.0
142
Flink CDC is a streaming data integration tool
Created 2020-07-27
1,002 commits to master branch, last one 14 hours ago
36
2.1k
apache-2.0
13
Open-source BI for engineers
Created 2024-02-20
297 commits to main branch, last one a day ago
112
1.9k
apache-2.0
19
data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
Created 2022-01-26
3,049 commits to devel branch, last one 2 days ago
144
1.7k
mit
13
Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
Created 2021-06-21
11,398 commits to main branch, last one a day ago
119
1.5k
apache-2.0
18
Efficient data transformation and modeling framework that is backwards compatible with dbt.
Created 2022-09-23
2,544 commits to main branch, last one 12 hours ago
149
802
apache-2.0
20
Dataform is a framework for managing SQL based data operations in BigQuery
Created 2018-09-03
1,682 commits to main branch, last one 7 hours ago
52
776
apache-2.0
13
Kuwala is the no-code data platform for BI analysts and engineers, enabling you to build powerful analytics workflows. We set out to bring state-of-the-art data engineering tools you love, such as ...
Created 2021-04-08
416 commits to master branch, last one about a year ago
153
741
apache-2.0
18
Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management.
Created 2021-03-22
487 commits to main branch, last one 11 months ago
Database replication platform that leverages change data capture. Stream production data from databases to your data warehouse (Snowflake, BigQuery, Redshift) in real-time.
Created 2022-11-06
663 commits to master branch, last one 19 hours ago
114
476
apache-2.0
26
A free to use dbt package for creating and loading Data Vault 2.0 compliant Data Warehouses (powered by dbt, an open source data engineering tool, registered trademark of dbt Labs)
Created 2019-09-27
4,037 commits to master branch, last one 19 days ago
dbt + Metabase integration
Created 2019-12-12
161 commits to master branch, last one 6 days ago
54
416
apache-2.0
16
One framework to develop, deploy and operate data workflows with Python and SQL.
Created 2021-07-20
2,134 commits to main branch, last one 13 days ago
94
377
apache-2.0
25
ReplicaDB is an open source tool for database replication, designed for efficiently transferring bulk data between relational and non-relational databases
Created 2018-12-05
442 commits to master branch, last one 2 months ago
39
328
apache-2.0
12
Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.
Created 2021-12-06
1,162 commits to main branch, last one 5 days ago
A serverless architecture for orchestrating ETL jobs in arbitrarily-complex workflows using AWS Step Functions and AWS Lambda.
Created 2018-05-08
14 commits to master branch, last one 4 years ago
Sling is a CLI tool that extracts data from a source storage/database and loads it in a target storage/database.
Created 2020-10-15
1,365 commits to main branch, last one 22 days ago
28
283
apache-2.0
12
Use SQL to build ELT pipelines on a data lakehouse.
Created 2021-03-11
481 commits to main branch, last one 2 years ago
15
225
apache-2.0
10
CLI tool for dbt users to simplify creation of staging model files (yml and sql)
Created 2021-06-28
864 commits to main branch, last one 2 days ago
Play detective on Reddit: Discover political disinformation campaigns, secret influencers and more
Created 2020-10-20
102 commits to main branch, last one 2 years ago
3
197
apache-2.0
7
Data Reconnaissance - pull request review tool for dbt projects
Created 2023-10-06
1,194 commits to main branch, last one 9 hours ago
PyAirbyte brings the power of Airbyte to every Python developer.
Created 2024-02-04
155 commits to main branch, last one 5 days ago