Search Results - RepositoryStats

waggle-dance ExpediaGroup

78

281

apache-2.0

20

Hive federation service. Enables disparate tables to be concurrently accessed across multiple Hive deployments.

hive metastore federation hive-metastore oss-portal-listed

Created 2017-07-17

516 commits to main branch, last one about a month ago

circus-train ExpediaGroup

15

88

apache-2.0

18

Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds.

s3 hive big-data bigquery hive-table replication hive-metastore replicate-data

Created 2017-11-17

267 commits to main branch, last one 2 years ago

hive-metastore naushadh

27

71

apache-2.0

1

Apache Hive Metastore as a Standalone server in Docker

spark trino docker presto localstack hive-metastore

Created 2022-09-09

23 commits to main branch, last one 7 months ago

Local-Data-LakeHouse dominikhei

13

63

unknown

4

Sample Data Lakehouse deployed in Docker containers using Apache Iceberg, Minio, Trino and a Hive Metastore. Can be used for local testing.

minio trino data-lake lakehouse apache-iceberg data-lakehouse hive-metastore

Created 2023-04-04

15 commits to main branch, last one about a year ago

hive-metastore-client quintoandar

15

54

apache-2.0

201

A client for connecting to and running DDLs on Hive Metastore.

etl ddls hive python package metastore hive-metastore data-engineering hive-metastore-client

Created 2020-11-19

66 commits to main branch, last one 2 years ago

beekeeper ExpediaGroup

7

46

apache-2.0

9

Service for automatically managing and cleaning up unreferenced data

s3 hive java cleanup big-data metastore maintenance hive-metastore oss-portal-featured

Created 2019-08-06

301 commits to main branch, last one 11 days ago

apache-spark-docker Wittline

27

43

apache-2.0

5

Dockerizing an Apache Spark Standalone Cluster

hue hdfs hive docker pyspark apache-spark dataengineer hadoop-docker docker-compose hadoop-cluster hive-metastore dataengineering

Created 2021-07-19

35 commits to main branch, last one 2 years ago

lasagna gmrqs

13

43

unknown

1

A Docker Compose template that builds an interactive development environment for PySpark with Jupyter Lab, MinIO as object storage, Hive Metastore, Trino and Kafka

minio spark trino docker jupyter pyspark jupyterlab docker-compose hive-metastore spark-streaming

Created 2023-02-22

40 commits to main branch, last one 3 months ago

e2e-data-platform thanhENC

7

34

mit

3

End-to-end data platform: A PoC Data Platform project utilizing a modern data stack (Spark, Airflow, DBT, Trino, Lightdash, Hive metastore, Minio, Postgres)

dbt spark trino airflow lightdash delta-lake end-to-end data-pipeline data-platform adventureworks docker-compose hive-metastore

Created 2024-08-08

108 commits to main branch, last one 5 months ago