Statistics for topic data-engineering
RepositoryStats tracks 584,796 Github repositories, of these 294 are tagged with the data-engineering topic. The most common primary language for repositories using this topic is Python (128). Other languages include: Jupyter Notebook (35), Go (17), JavaScript (12), TypeScript (11)
Stargazers over time for topic data-engineering
Most starred repositories for topic data-engineering (view more)
Trending repositories for topic data-engineering (view more)
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Apache Superset is a Data Visualization and Data Exploration Platform
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
Turns Data and AI algorithms into production-ready web applications in no time.
Learn how to design, develop, deploy and iterate on production-grade ML applications.
数据流引擎是一款面向数据集成、数据同步、数据交换、数据共享、任务配置、任务调度的底层数据驱动引擎。数据流引擎采用管执分离、多流层、插件库等体系应对大规模数据任务、数据高频上报、数据高频采集、异构数据兼容的实际数据问题。
🥪🦘 An open source sandbox project exploring dbt workflows via a fictional sandwich shop's data.
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Apache Superset is a Data Visualization and Data Exploration Platform
Turns Data and AI algorithms into production-ready web applications in no time.
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
数据流引擎是一款面向数据集成、数据同步、数据交换、数据共享、任务配置、任务调度的底层数据驱动引擎。数据流引擎采用管执分离、多流层、插件库等体系应对大规模数据任务、数据高频上报、数据高频采集、异构数据兼容的实际数据问题。
Collection of Snowflake Notebook demos, tutorials, and examples
Declarative text based tool for data analysts and engineers to extract, load, transform and orchestrate their data pipelines.
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
Turns Data and AI algorithms into production-ready web applications in no time.
Apache Superset is a Data Visualization and Data Exploration Platform
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Learn how to design, develop, deploy and iterate on production-grade ML applications.
Materials for the Deploy and Monitor ML Pipelines with Python, Docker and GitHub Actions workshop at the PyData NYC 2024 conference
数据流引擎是一款面向数据集成、数据同步、数据交换、数据共享、任务配置、任务调度的底层数据驱动引擎。数据流引擎采用管执分离、多流层、插件库等体系应对大规模数据任务、数据高频上报、数据高频采集、异构数据兼容的实际数据问题。
A curated list of open source tools used in analytics platforms and data engineering ecosystem
Data Engineering Project with Hadoop HDFS and Kafka
Un repositorio más con conceptos básicos, desafíos técnicos y recursos sobre ingeniería de datos en español 🧙✨
Python framework for building efficient data pipelines. It promotes modularity and collaboration, enabling the creation of complex pipelines from simple, reusable components.
数据流引擎是一款面向数据集成、数据同步、数据交换、数据共享、任务配置、任务调度的底层数据驱动引擎。数据流引擎采用管执分离、多流层、插件库等体系应对大规模数据任务、数据高频上报、数据高频采集、异构数据兼容的实际数据问题。
Code for "Efficient Data Processing in Spark" Course
Turns Data and AI algorithms into production-ready web applications in no time.
Apache Superset is a Data Visualization and Data Exploration Platform
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
Code for "Efficient Data Processing in Spark" Course
Turns Data and AI algorithms into production-ready web applications in no time.
🥪🦘 An open source sandbox project exploring dbt workflows via a fictional sandwich shop's data.