Statistics for topic data-pipelines
RepositoryStats tracks 635,088 GitHub repositories; of these, 60 are tagged with the data-pipelines topic. The most common primary language for repositories using this topic is Python (25).
Stargazers over time for topic data-pipelines
Most starred repositories for topic data-pipelines
Trending repositories for topic data-pipelines
Preswald is a framework for building and deploying interactive data apps, internal tools, and dashboards with Python. With one command, you can launch, share, and deploy locally or in the cloud, turni...
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
An orchestration platform for the development, production, and observation of data assets.
Apache DolphinScheduler is the modern data orchestration platform. Agile to create high performance workflow with low-code
dbt package that is part of Elementary, the dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service wit...
Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.
Learn the basics of Apache Kafka® from leaders in the Kafka community with these video courses covering the Kafka ecosystem and hands-on exercises.
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
Main repo including core data model, data marts, data quality tests, and terminology sets.
A configuration-driven framework for building Dagster pipelines that enables teams to create and manage data workflows using YAML/JSON instead of code
Visual Data Transformation and Data Preparation. Low-Code Python-based ETL.
Build data pipelines with SQL and Python, ingest data from different sources, add quality checks, and build end-to-end flows.
A high-performance, extremely flexible, and easily extensible universal workflow engine.