Trending repositories for topic data-integration
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows (see the example DAG sketch after this list)
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
An orchestration platform for the development, production, and observation of data assets (see the asset sketch after this list).
Turns Data and AI algorithms into production-ready web applications in no time.
SeaTunnel is a next-generation, high-performance, distributed tool for massive data integration.
Lean and mean distributed stream processing system written in Rust and WebAssembly. An alternative to Kafka + Flink in one.
🧙 Build, run, and manage data pipelines for integrating and transforming data.
Community-curated list of software packages and data resources for single-cell analysis, including RNA-seq, ATAC-seq, etc.
Jitsu is an open-source Segment alternative. Fully scriptable data ingestion engine for modern data teams. Set up a real-time data pipeline in minutes, not days.
The Common Core Ontology Repository holds the current released version of the Common Core Ontology suite.
Conduit streams data between data stores. Kafka Connect replacement. No JVM required.
Kuwala is the no-code data platform for BI analysts and engineers, enabling you to build powerful analytics workflows. We set out to bring state-of-the-art data engineering tools you love, such as ...
Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragmented data from DevOps tools, extracting insights for engineering excellence, developer experience, and co...
ingestr is a CLI tool to seamlessly copy data between any databases with a single command.
A curated list of open-source tools used in analytics platforms and the data engineering ecosystem
SeaTunnel is a distributed, high-performance data integration platform for the synchronization and transformation of massive data (offline & real-time).
Privacy- and security-focused Segment alternative, written in Golang and React
Demonstrates the integration of Senzing and Neo4j to construct an Entity Resolved Knowledge Graph
Framework and command-line tools for integrating FollowTheMoney data streams from multiple sources
Fast, sensitive and accurate integration of single-cell data with Harmony
Declarative, text-based tool for data analysts and engineers to extract, load, transform, and orchestrate their data pipelines.
A high-performance, extremely flexible, and easily extensible universal workflow engine.
A .NET class library that allows you to import data from different sources into a unified destination
Database replication platform that leverages change data capture. Stream production data from databases to your data warehouse (Snowflake, BigQuery, Redshift, Databricks) in real time.
A configuration-driven framework for building Dagster pipelines that enables teams to create and manage data workflows using YAML/JSON instead of code
Self-contained distributed software platform for building stateful, massively real-time streaming applications in Rust.
Building data processing pipelines for document processing with NLP, using Apache NiFi and related services
🎼 Integrate multiple high-dimensional datasets with fuzzy k-means and locally linear adjustments.
Dataplane is an Airflow-inspired unified data platform with additional data mesh and RPA capabilities to automate, schedule, and design data pipelines and workflows. Dataplane is written in Golang with a...
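The Apache Airflow entry above describes programmatically authoring, scheduling, and monitoring workflows. The sketch below illustrates that idea under the assumption of a recent Airflow 2.x install with the TaskFlow API; the DAG id, task names, and payload are made up for illustration.

```python
# Minimal sketch of programmatically authoring a workflow in Apache Airflow.
# Assumes a recent Airflow 2.x release with the TaskFlow API; names and data are illustrative.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def example_data_integration():
    @task
    def extract() -> dict:
        # Stand-in for pulling a few rows from a source system.
        return {"rows": [1, 2, 3]}

    @task
    def load(payload: dict) -> None:
        # Stand-in for writing the rows to a destination.
        print(f"loaded {len(payload['rows'])} rows")

    load(extract())


example_data_integration()
```

Dropping a file like this into the Airflow dags folder would register one daily DAG with two tasks; the scheduler and web UI then cover the "schedule and monitor" half of the description.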
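The orchestration-platform entry above is Dagster's tagline (a configuration-driven framework for Dagster also appears in the list), framing pipelines as data assets that are developed, produced, and observed. A minimal sketch of that asset-based style follows, assuming a recent Dagster release; the asset names and values are hypothetical.

```python
# Minimal sketch of software-defined assets in Dagster.
# Assumes a recent Dagster release; asset names and data are illustrative only.
from dagster import asset, materialize


@asset
def raw_numbers():
    # Stand-in for data ingested from a source system.
    return [1, 2, 3]


@asset
def doubled_numbers(raw_numbers):
    # Downstream asset; the parameter name declares the dependency on raw_numbers.
    return [n * 2 for n in raw_numbers]


if __name__ == "__main__":
    # Materialize both assets in dependency order.
    materialize([raw_numbers, doubled_numbers])
```

Declaring dependencies by parameter name is what gives Dagster the lineage it uses for the "observation" part: the same definitions back the asset graph and materialization history in its UI.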