Statistics for topic data-engineering
RepositoryStats tracks 605,145 GitHub repositories; of these, 312 are tagged with the data-engineering topic. The most common primary language for repositories using this topic is Python (135). Other languages include Jupyter Notebook (36), Go (18), JavaScript (13), Scala (12), and TypeScript (12).
Stargazers over time for topic data-engineering
Most starred repositories for topic data-engineering
Trending repositories for topic data-engineering
Apache Superset is a Data Visualization and Data Exploration Platform
Learn how to design, develop, deploy and iterate on production-grade ML applications.
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Declarative text-based tool for data analysts and engineers to extract, load, transform and orchestrate their data pipelines.
Jayvee is a domain-specific language and runtime for automated processing of data pipelines
My Digital Palace - A Personal Journal for Reflection - A place to store all my thoughts
Apache Superset is a Data Visualization and Data Exploration Platform
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
An orchestration platform for the development, production, and observation of data assets.
Declarative text-based tool for data analysts and engineers to extract, load, transform and orchestrate their data pipelines.
A curated list of open source tools used in analytics platforms and data engineering ecosystem
This repo contains "Databricks Certified Data Engineer Professional" Questions and related docs.
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Apache Superset is a Data Visualization and Data Exploration Platform
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
A curated collection of AI, data engineering, and DevOps projects featuring real-world applications, advanced techniques, and tutorials—ideal for learners and practitioners exploring data science and ...
Python framework for building efficient data pipelines. It promotes modularity and collaboration, enabling the creation of complex pipelines from simple, reusable components.
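The Data Flow Engine is a low-level, data-driven engine for data integration, data synchronization, data exchange, data sharing, task configuration, and task scheduling. It uses separation of management and execution, multiple flow layers, and a plugin library to handle practical data problems such as large-scale data tasks, high-frequency data reporting, high-frequency data collection, and heterogeneous data compatibility.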
Turns Data and AI algorithms into production-ready web applications in no time.
Apache Superset is a Data Visualization and Data Exploration Platform
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
Code for "Efficient Data Processing in Spark" Course
The data-validation toolkit for enhanced dbt (data build tool) PR review
A curated list of open source tools used in analytics platforms and data engineering ecosystem