Statistics for topic etl
RepositoryStats tracks 584,792 GitHub repositories, of which 258 are tagged with the etl topic. The most common primary language for repositories using this topic is Python (90). Other languages include Go (38), Java (30), TypeScript (13), and Rust (11).
Stargazers over time for topic etl
Most starred repositories for topic etl
Trending repositories for topic etl
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Apache Doris is an easy-to-use, high performance and unified analytics database.
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
🧙 Build, run, and manage data pipelines for integrating and transforming data.
Ylem is an open-source platform for real-time data streaming orchestration
Context-aware structured outputs. Search your documents or the web for specific data and get it back in JSON or Markdown.
Zero-ETL, infinite possibilities. Live query APIs, code & more with SQL. No DB required.
Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.
Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
Visual Data Transformation with Python Code Generation. Low-Code Python-based ETL.
Radient turns many data types (not just text) into vectors for similarity search, RAG, regression analysis, and more.
End-to-end data engineering project to get insights from PyPI using Python, DuckDB, MotherDuck & Evidence
Open source data anonymization and synthetic data orchestration for developers. Create high fidelity synthetic data and sync it across your environments.
An orchestration platform for the development, production, and observation of data assets.
A curated list of open source tools used in analytics platforms and data engineering ecosystem