Statistics for topic data-integration
RepositoryStats tracks 633,100 GitHub repositories; 60 of these are tagged with the data-integration topic. The most common primary language for repositories using this topic is Python (23 repositories).
Stargazers over time for topic data-integration
Most starred and trending repositories for topic data-integration
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
Lean and mean distributed stream processing system written in Rust and WebAssembly. Alternative to Kafka + Flink in one.
Turns Data and AI algorithms into production-ready web applications in no time.
🧙 Build, run, and manage data pipelines for integrating and transforming data.
SeaTunnel is a distributed, high-performance data integration platform for the synchronization and transformation of massive data (offline & real-time).
Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragmented data from DevOps tools, extracting insights for engineering excellence, developer experience, and co...
An orchestration platform for the development, production, and observation of data assets.
A curated list of open source tools used in analytics platforms and data engineering ecosystem
A high-performance, extremely flexible, and easily extensible universal workflow engine.
WInte.r is a Java framework for end-to-end data integration. The WInte.r framework implements well-known methods for data pre-processing, schema matching, identity resolution, data fusion, and result ...
An Efficient RML-Compliant Engine for Knowledge Graph Construction
Powerful RDF Knowledge Graph Generation with RML Mappings
The Common Core Ontology Repository holds the current released version of the Common Core Ontology suite.
Declarative text based tool for data analysts and engineers to extract, load, transform and orchestrate their data pipelines.
Conduit streams data between data stores. Kafka Connect replacement. No JVM required.
A configuration-driven framework for building Dagster pipelines that enables teams to create and manage data workflows using YAML/JSON instead of code
Self-contained distributed software platform for building stateful, massively real-time streaming applications in Rust.