Trending repositories for topic airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
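Apache DolphinScheduler is a modern data orchestration platform for quickly creating high-performance workflows with low code
A WeChat intelligent application orchestration framework built on Apache Airflow that drives AI and data automation tasks through visual workflows. Supports applications such as intelligent customer service (multi-turn dialogue, knowledge base), AI image-and-text and short-video generation, and smart reminders, with flexible extension of multimodal interaction and large-model capabilities.
A few projects related to data engineering, including data modeling, cloud infrastructure setup, data warehousing, and data lake development.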
Dynamically generate Apache Airflow DAGs from YAML configuration files
The goal of this project is to build a docker cluster that gives access to Hadoop, HDFS, Hive, PySpark, Sqoop, Airflow, Kafka, Flume, Postgres, Cassandra, Hue, Zeppelin, Kadmin, Kafka Control Center ...
Run your dbt Core projects as Apache Airflow DAGs and Task Groups with a few lines of code
🌀 The Full Stack 7-Steps MLOps Framework | Learn MLE & MLOps for free by designing, building and deploying an end-to-end ML batch system ~ source c...
DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualizati...
A series of DAGs/Workflows to help maintain the operation of Airflow
This is a repository to demonstrate my details, skills, projects and to keep track of my progression in Data Analytics and Data Science topics.
Projects done in the Data Engineer Nanodegree Program by Udacity.com
User-friendly, open-source platform for workflow creation and monitoring
Resources for video demonstrations and blog posts related to DataOps on AWS
A CLI tool to streamline getting started with Apache Airflow™ and managing multiple Airflow projects
Source code of the Apache Airflow Tutorial for Beginners on YouTube Channel Coder2j (https://www.youtube.com/c/coder2j)
Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.
More than 2,000 data engineer interview questions.
The User-Community Airflow Helm Chart is the standard way to deploy Apache Airflow on Kubernetes with Helm. Originally created in 2017, it has since helped thousands of companies create production-rea...
TorchX is a universal job launcher for PyTorch applications. TorchX is designed to have fast iteration time for training/research and support for E2E production ML pipelines when you're ready.
📈 A scalable, production-ready data pipeline for real-time streaming & batch processing, integrating Kafka, Spark, Airflow, AWS, Kubernetes, and MLflow. Supports end-to-end data ingestion, transforma...
Detailed notes and homework from the 2025 Data Engineering Zoomcamp by DataTalks.Club
Produce Kafka messages, consume them and upload into Cassandra, MongoDB.
This repository contains the code snippets, steps, and other artifacts used in the YouTube demo videos. You can use it to access the code or artifacts.
HashiQube - The Ultimate Hands-on DevOps Lab running All the HashiCorp Products in a GitHub Codespace or a Docker Container using Vagrant or Docker Compose
🐳 Project work. Lectures, practical assignments, and projects from karpov_courses are stored here. Link: https://karpov.courses/
Data Foundation - Google Cloud Cortex Framework
A web service in which a user describes their legal situation through a chat web interface; the model interprets the context of the input, offers guidance, and provides precedents from similar cases. (Service ended 2023.08.18)
Low-cost CRM architecture with Gen AI, designed for startups that need to process and analyze sales data efficiently.
End-to-end data platform: a PoC data platform project using a modern data stack (Spark, Airflow, DBT, Trino, Lightdash, Hive metastore, Minio, Postgres)
Integrating Airbyte, Kafka, Airflow and MLflow on Azure Linux VMs within a private network to continuously retrain an LSTM Attention model with 1-minute stock prices and redeploy it on Azure ML AKS real-ti...
A data engineering project with Kafka, Spark Streaming, dbt, Docker, Airflow, Terraform, GCP and much more!
An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
A portable Datamart and Business Intelligence suite built with Docker, Airflow, dbt, PostgreSQL and Superset
Dockerized monitoring stack for Apache Airflow
This repository serves as a comprehensive guide to effective data modeling and robust data quality assurance using popular open-source tools
A Python package that creates fine-grained dbt tasks on Apache Airflow
Full-stack Highly Scalable Cloud-native Machine Learning system for demand forecasting with realtime data streaming, inference, retraining loop, and more
End-to-end data platform leveraging the Modern data stack
Playground for Lakehouse (Iceberg, Hudi, Spark, Flink, Trino, DBT, Airflow, Kafka, Debezium CDC)
A real-time streaming pipeline that extracts stock data using Apache NiFi, Debezium, Kafka, and Spark Streaming. Loaded the transformed data into Glue database and created real-time dashboards usi...