Statistics for topic data-pipelines

RepositoryStats tracks 635,088 GitHub repositories, of which 60 are tagged with the data-pipelines topic. The most common primary language for repositories using this topic is Python (25).

Stargazers over time for topic data-pipelines

Most starred repositories for topic data-pipelines

airflow apache

14.8k

39.5k

apache-2.0

764

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

Created 2015-04-13

29,332 commits to main branch, last one 7 hours ago

pathway pathwaycom

348

23.7k

other

Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.

etl rust kafka python pathway dataflow real-time streaming etl-framework iot-analytics data-analytics data-pipelines data-processing batch-processing stream-processing time-series-analysis machine-learning-algorithms

Created 2022-11-27

1,227 commits to main branch, last one 9 hours ago

dolphinscheduler apache

4.7k

13.4k

apache-2.0

325

Apache DolphinScheduler is the modern data orchestration platform. Agile to create high performance workflow with low-code

airflow azkaban workflow cloud-native job-scheduler orchestration data-pipelines task-scheduler workflow-schedule workflow-orchestration powerful-data-pipelines

Created 2019-03-01

8,628 commits to dev branch, last one a day ago

dagster dagster-io

1.6k

12.9k

apache-2.0

123

An orchestration platform for the development, production, and observation of data assets.

etl mlops python dagster metadata workflow analytics scheduler data-science orchestration data-pipelines data-engineering data-integration data-orchestrator workflow-automation

Created 2018-04-30

23,206 commits to master branch, last one 7 hours ago

unstructured Unstructured-IO

894

10.8k

apache-2.0

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.

Created 2022-09-26

1,712 commits to main branch, last one 23 hours ago

mage-ai mage-ai

828

8.2k

apache-2.0

🧙 Build, run, and manage data pipelines for integrating and transforming data.

dbt elt etl sql data spark python pipeline pipelines reverse-etl data-science orchestration data-pipelines transformation data-engineering data-integration machine-learning artificial-intelligence

Created 2022-05-16

5,573 commits to master branch, last one 2 days ago

