Trending repositories for topic airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
🌀 The Full Stack 7-Steps MLOps Framework | Learn MLE & MLOps for free by designing, building and deploying an end-to-end ML batch system ~ source c...
A few projects related to Data Engineering, including Data Modeling, Infrastructure setup on cloud, Data Warehousing and Data Lake development.
DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualizati...
More than 2,000 data engineer interview questions.
TorchX is a universal job launcher for PyTorch applications. TorchX is designed to have fast iteration time for training/research and support for E2E production ML pipelines when you're ready.
A data engineering project with Kafka, Spark Streaming, dbt, Docker, Airflow, Terraform, GCP and much more!
Run your dbt Core projects as Apache Airflow DAGs and Task Groups with a few lines of code
Cloud-native, data onboarding architecture for Google Cloud Datasets
🐳 Project work. Lectures, practical assignments, and projects from karpov_courses are stored here. Link: https://karpov.courses/
Source code of the Apache Airflow Tutorial for Beginners on YouTube Channel Coder2j (https://www.youtube.com/c/coder2j)
An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
Apache DolphinScheduler is a modern data orchestration platform for agile creation of high-performance workflows with low code.
The User-Community Airflow Helm Chart is the standard way to deploy Apache Airflow on Kubernetes with Helm. Originally created in 2017, it has since helped thousands of companies create production-rea...
User-friendly, open source platform for workflow creation and monitoring
An end-to-end LLM reference implementation providing a Q&A interface for Airflow and Astronomer
Low-cost CRM architecture with Gen AI, designed for startups that need to process and analyze sales data efficiently.
A self-contained, ready-to-run Airflow ELT project that can be run locally or within Codespaces.
Docker Airflow - Contains a docker compose file for Airflow 2.0
HashiQube - The Ultimate Hands-on DevOps Lab running all the HashiCorp products in a GitHub Codespace or a Docker container using Vagrant or Docker Compose
This is a repository to demonstrate my details, skills, projects and to keep track of my progression in Data Analytics and Data Science topics.
Pipeline that extracts data from Crinacle's Headphone and InEarMonitor databases and finalizes data for a Metabase Dashboard. The dashboard is then used to support a purchasing decision of which Headp...
Full-stack, highly scalable, cloud-native machine learning system for demand forecasting with real-time data streaming, inference, retraining loop, and more
Integrating Airbyte, Kafka, Airflow and MLflow on Azure Linux VMs within a private network to continuously retrain an LSTM Attention model with 1-minute stock prices and redeploy it on Azure ML AKS real-ti...
Dynamically generate Apache Airflow DAGs from YAML configuration files
A series of DAGs/Workflows to help maintain the operation of Airflow
A Python package to submit and manage Apache Spark applications on Kubernetes.
A Python package that creates fine-grained dbt tasks on Apache Airflow
This repository serves as a comprehensive guide to effective data modeling and robust data quality assurance using popular open-source tools
Playground for Lakehouse (Iceberg, Hudi, Spark, Flink, Trino, DBT, Airflow, Kafka, Debezium CDC)
Data Engineering examples for Airflow, Prefect, and Mage.ai; dbt for BigQuery, Redshift, ClickHouse, PostgreSQL; Spark/PySpark for Batch processing; and Kafka for Stream processing
A real-time streaming pipeline that extracts stock data using Apache NiFi, Debezium, Kafka, and Spark Streaming. Loaded the transformed data into a Glue database and created real-time dashboards usi...
The goal of this project is to build a docker cluster that gives access to Hadoop, HDFS, Hive, PySpark, Sqoop, Airflow, Kafka, Flume, Postgres, Cassandra, Hue, Zeppelin, Kadmin, Kafka Control Center ...