Trending repositories for topic data-integration
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows (see the DAG sketch after this list)
Turns Data and AI algorithms into production-ready web applications in no time.
ingestr is a CLI tool that seamlessly copies data between databases with a single command.
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
An orchestration platform for the development, production, and observation of data assets (see the asset sketch after this list).
SeaTunnel is a next-generation, high-performance, distributed tool for massive-scale data integration.
Community-curated list of software packages and data resources for single-cell analysis, including RNA-seq, ATAC-seq, etc.
Lean and mean distributed stream-processing system written in Rust and WebAssembly. An alternative to Kafka + Flink in one.
🧙 Build, run, and manage data pipelines for integrating and transforming data.
Perform historical snapshots without database locks and read change data capture logs from databases. Artie Reader is compatible with Debezium and is written in Go.
A curated list of open source tools used in analytics platforms and the data engineering ecosystem
The open-source, high-performance ELT framework powered by Apache Arrow
NicheNet: predict active ligand-target links between interacting cells
Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragmented data from DevOps tools, extracting insights for engineering excellence, developer experience, and co...
Jitsu is an open-source Segment alternative. Fully-scriptable data ingestion engine for modern data teams. Set up a real-time data pipeline in minutes, not days
SeaTunnel is a distributed, high-performance data integration platform for the synchronization and transformation of massive data (offline & real-time).
Privacy- and security-focused Segment alternative, in Golang and React
Declarative, text-based tool for data analysts and engineers to extract, load, transform, and orchestrate their data pipelines.
Self-contained distributed software platform for building stateful, massively real-time streaming applications in Rust.
The Common Core Ontology Repository holds the current released version of the Common Core Ontology suite.
🎼 Integrate multiple high-dimensional datasets with fuzzy k-means and locally linear adjustments.
Demonstrate integration of Senzing and Neo4j to construct an Entity Resolved Knowledge Graph
Build REST APIs/Integrations in minutes instead of hours - NF Compose is a (data) integration platform that allows developers to define REST APIs in seconds instead of hours. Generated REST APIs are b...
Conduit streams data between data stores. Kafka Connect replacement. No JVM required.
Database replication platform that leverages change data capture. Stream production data from databases to your data warehouse (Snowflake, BigQuery, Redshift, Databricks) in real time.
Dataplane is an Airflow inspired unified data platform with additional data mesh and RPA capability to automate, schedule and design data pipelines and workflows. Dataplane is written in Golang with a...
First Party data integration solution built for marketing teams to enable audience and conversion onboarding into Google Marketing products (Google Ads, Campaign Manager, Google Analytics).
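Two minimal sketches follow for the Apache Airflow and Dagster entries above. They are illustrative only: the DAG/asset names, schedules, and callables are hypothetical, and the snippets assume Airflow 2.x and a recent Dagster release respectively; consult each project's documentation for current APIs.

```python
# Hedged sketch of an Airflow DAG (assumes Airflow 2.x; names and callables are hypothetical).
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    # Placeholder: pull rows from a source system.
    print("extracting")


def load():
    # Placeholder: write rows to a warehouse.
    print("loading")


with DAG(
    dag_id="example_data_integration",  # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)
    extract_task >> load_task  # extract runs before load
```

The same kind of pipeline expressed as Dagster software-defined assets, again only a sketch with hypothetical asset names:

```python
# Hedged sketch of Dagster assets (assumes a recent Dagster release; asset names are hypothetical).
from dagster import asset, materialize


@asset
def raw_orders():
    # Placeholder: extract rows from a source system.
    return [{"id": 1}, {"id": 2}]


@asset
def order_count(raw_orders):
    # Downstream asset derived from raw_orders.
    return len(raw_orders)


if __name__ == "__main__":
    # Materialize both assets in-process for a quick local check.
    materialize([raw_orders, order_count])
```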