Trending repositories for topic airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Apache DolphinScheduler is a modern data orchestration platform for quickly building high-performance workflows with low code
A few projects related to data engineering, including data modeling, cloud infrastructure setup, data warehousing, and data lake development.
Pipeline that extracts data from Crinacle's Headphone and InEarMonitor databases and finalizes data for a Metabase Dashboard.
TorchX is a universal job launcher for PyTorch applications. TorchX is designed to have fast iteration time for training/research and support for E2E production ML pipelines when you're ready.
A plugin for Apache Airflow that allows you to edit DAGs in browser
More than 2,000 data engineer interview questions.
Airflow plugin to export dag and task based metrics to Prometheus.
Run your dbt Core projects as Apache Airflow DAGs and Task Groups with a few lines of code
HashiQube - The Ultimate Hands-on DevOps Lab running all the HashiCorp products in a GitHub Codespace or a Docker container using Vagrant or Docker Compose
Dynamically generate Apache Airflow DAGs from YAML configuration files
DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualizati...
The User-Community Airflow Helm Chart is the standard way to deploy Apache Airflow on Kubernetes with Helm. Originally created in 2017, it has since helped thousands of companies create production-rea...
The goal of this project is to build a docker cluster that gives access to Hadoop, HDFS, Hive, PySpark, Sqoop, Airflow, Kafka, Flume, Postgres, Cassandra, Hue, Zeppelin, Kadmin, Kafka Control Center ...
Projects done in the Data Engineer Nanodegree Program by Udacity.com
Cloud-native, data onboarding architecture for Google Cloud Datasets
Data Foundation - Google Cloud Cortex Framework
A data engineering project with Kafka, Spark Streaming, dbt, Docker, Airflow, Terraform, GCP and much more!
A series of DAGs/Workflows to help maintain the operation of Airflow
Integrating Airbyte, Kafka, Airflow and MLflow on Azure Linux VMs within a private network to periodically retrain an LSTM attention model with 1-minute stock prices and redeploy it on Azure ML AKS real-ti...
A self-contained, ready-to-run Airflow ELT project. Can be run locally or within Codespaces.
Playground for Lakehouse (Iceberg, Hudi, Spark, Flink, Trino, DBT, Airflow, Kafka, Debezium CDC)
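A web service in which a user describes their legal situation through a chat web interface; the model understands the context of the input, offers guidance, and provides precedents from similar cases. (Service ended 2023.08.18)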
User-friendly, open-source platform for workflow creation and monitoring
Full-stack, highly scalable, cloud-native machine learning system for demand forecasting with real-time data streaming, inference, a retraining loop, and more
Produce Kafka messages, consume them, and upload them into Cassandra and MongoDB.
A Python package to submit and manage Apache Spark applications on Kubernetes.
This project grew out of an interest in data engineering and ETL pipelines. It also provided a good opportunity to develop skills and experience in a range of tools. As such, the project is more complex than req...
🌀 The Full Stack 7-Steps MLOps Framework | Learn MLE & MLOps for free by designing, building and deploying an end-to-end ML batch system ~ source c...
An end-to-end LLM reference implementation providing a Q&A interface for Airflow and Astronomer
An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
A Python package that creates fine-grained dbt tasks on Apache Airflow
Create streaming data, transfer it to Kafka, transform it with PySpark, and load it into Elasticsearch and MinIO
Write a CSV file to Postgres, read the table, and modify it. Write more tables to Postgres with Airflow.
Get data from an API, run a scheduled script with Airflow, send the data to Kafka, consume it with Spark, then write it to Cassandra
An example of how to deploy Apache Airflow on Amazon ECS Fargate