32 results found

7.2k
39.6k
apache-2.0
688
ClickHouse® is a real-time analytics database management system
Created 2016-06-02
170,512 commits to master branch, last one 19 hours ago
5.4k
16.3k
apache-2.0
851
The official home of the Presto distributed SQL query engine for big data
Created 2012-08-09
24,008 commits to master branch, last one a day ago
3.4k
13.3k
apache-2.0
285
Apache Doris is an easy-to-use, high performance and unified analytics database.
Created 2017-08-10
24,862 commits to master branch, last one 19 hours ago
1.9k
9.7k
apache-2.0
189
The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance for...
Created 2021-09-04
20,120 commits to main branch, last one a day ago
770
8.3k
other
96
Data, Analytics & AI. Modern alternative to Snowflake. Cost-effective and simple for massive-scale analytics. https://databend.com
Created 2020-10-10
32,551 commits to main branch, last one a day ago
403
2.6k
apache-2.0
251
LakeSoul is an end-to-end, realtime and cloud native Lakehouse framework with fast data ingestion, concurrent update and incremental data analytics on cloud storages for both BI and AI applications.
Created 2021-12-28
1,183 commits to main branch, last one 6 days ago
327
2.2k
apache-2.0
49
ByConity is an open source cloud data warehouse
Created 2022-12-22
72,498 commits to master branch, last one 19 days ago
146
2.0k
apache-2.0
42
YTsaurus is a scalable and fault-tolerant open-source big data platform.
Created 2022-12-05
79,562 commits to main branch, last one 19 hours ago
416
1.4k
apache-2.0
33
World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
Created 2023-04-23
2,364 commits to main branch, last one 23 hours ago
Postgres-Native Data Warehouse
Created 2024-09-05
97 commits to main branch, last one a day ago
320
936
apache-2.0
38
Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.
Created 2022-07-14
1,555 commits to master branch, last one 2 days ago
52
534
apache-2.0
7
Fastest open-source tool for replicating Databases to Apache Iceberg or Data Lakehouse. ⚡ Efficient, quick and scalable data ingestion for real-time analytics. Supporting Postgres, MongoDB and MySQL ...
Created 2024-10-15
196 commits to master branch, last one 5 days ago
21
517
postgresql
5
DuckDB-powered data lake analytics from Postgres
Created 2024-05-09
121 commits to dev branch, last one 13 days ago
32
493
apache-2.0
5
Lakekeeper is an Apache-Licensed, secure, fast and easy to use Apache Iceberg REST Catalog written in Rust.
Created 2024-04-05
646 commits to main branch, last one 24 hours ago
28
285
apache-2.0
11
Use SQL to build ELT pipelines on a data lakehouse.
Created 2021-03-11
481 commits to main branch, last one 2 years ago
A curated list of open source tools used in analytics platforms and data engineering ecosystem
Created 2024-02-22
23 commits to main branch, last one 7 days ago
Examples of using Terraform to deploy Databricks resources
Created 2022-06-10
184 commits to main branch, last one about a month ago
82
240
apache-2.0
11
A modern data marketplace that makes collaboration among diverse users (like business, analysts and engineers) easier, increasing efficiency and agility in data projects on AWS.
Created 2022-03-08
1,171 commits to main branch, last one 25 days ago
43
235
apache-2.0
18
The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for several lakehouse algorithms, data flows and utilities for Data Prod...
Created 2022-11-11
20 commits to master branch, last one about a month ago
52
224
other
18
An intelligent database for the AI era
Created 2019-07-16
221 commits to master branch, last one about a year ago
33
212
other
11
The open-source, AI-native data stack
Created 2024-09-10
685 commits to main branch, last one a day ago
19
163
apache-2.0
12
Pure Rust Iceberg Implementation
This repository has been archived
Created 2023-06-15
185 commits to main branch, last one 7 months ago
8
155
apache-2.0
7
Unified storage framework for the entire machine learning lifecycle
Created 2023-12-15
99 commits to main branch, last one about a year ago
13
141
apache-2.0
2
A data framework for biology.
Created 2022-04-15
4,044 commits to main branch, last one 18 hours ago
9
71
apache-2.0
2
Lakehouse storage system benchmark
Created 2022-12-15
42 commits to main branch, last one 2 years ago
Modern serverless lakehouse implementing HOOK methodology, Unified Star Schema (USS), and Analytical Data Storage System (ADSS) principles on Adventure Works. Features programmatic model generation, e...
Created 2025-02-01
215 commits to main branch, last one a day ago
A curated list of awesome Online Analytical Processing databases, frameworks, resources and other awesomeness.
Created 2023-08-27
6 commits to main branch, last one 2 months ago
Sample Data Lakehouse deployed in Docker containers using Apache Iceberg, Minio, Trino and a Hive Metastore. Can be used for local testing.
Created 2023-04-04
15 commits to main branch, last one about a year ago
Accelerator to build a Microsoft Fabric modern data platform using pre-built reusable Fabric items and an orchestration ELT Framework
Created 2024-04-20
156 commits to main branch, last one 5 days ago
Creation of a data lakehouse and an ELT pipeline to enable the efficient analysis and use of data
Created 2022-05-13
10 commits to master branch, last one about a year ago