Statistics for topic data-pipelines

RepositoryStats tracks 607,673 GitHub repositories, of which 57 are tagged with the data-pipelines topic. The most common primary language among repositories using this topic is Python (23).

Stargazers over time for topic data-pipelines

Most starred repositories for topic data-pipelines

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Stars: 38.4k, forks: 14.6k, watchers: 767, license: apache-2.0
Created 2015-04-13
27,793 commits to main branch, last one 20 hours ago

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
Stars: 29.7k, forks: 2.8k, watchers: 164, license: apache-2.0
Created 2023-12-12
2,134 commits to main branch, last one 23 hours ago

Apache DolphinScheduler is the modern data orchestration platform. Agile to create high performance workflow with low-code
Stars: 13.2k, forks: 4.7k, watchers: 328, license: apache-2.0
Created 2019-03-01
8,594 commits to dev branch, last one a day ago

Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
Stars: 12.6k, forks: 292, watchers: 45, license: other
Created 2022-11-27
1,030 commits to main branch, last one 22 hours ago

An orchestration platform for the development, production, and observation of data assets.
Stars: 12.4k, forks: 1.6k, watchers: 125, license: apache-2.0
Created 2018-04-30
22,029 commits to master branch, last one 18 hours ago

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Stars: 9.9k, forks: 826, watchers: 66, license: apache-2.0
Created 2022-09-26
1,666 commits to main branch, last one a day ago

Trending repositories for topic data-pipelines