97 results found

150
3.7k
other
27
Command-line tool for running SQL queries against JSON, CSV, Excel, Parquet, and more.
Created 2022-01-10
102 commits to main branch, last one about a year ago
179
3.2k
apache-2.0
43
Create full-fledged APIs for slowly moving datasets without writing a single line of code.
Created 2020-12-11
266 commits to main branch, last one 28 days ago
1.4k
2.6k
apache-2.0
93
Apache Parquet Java
Created 2014-06-10
2,707 commits to master branch, last one 24 hours ago
799
2.6k
apache-2.0
52
Official Rust implementation of Apache Arrow
Created 2021-04-17
6,136 commits to master branch, last one 13 hours ago
71
2.5k
unlicense
17
Blazing-fast Data-Wrangling toolkit
Created 2020-12-11
10,911 commits to master branch, last one 19 hours ago
979
1.9k
apache-2.0
153
Apache Drill is a distributed MPP query layer for self describing data
Created 2012-09-05
4,523 commits to master branch, last one 2 days ago
431
1.8k
apache-2.0
67
Apache Parquet Format
Created 2014-06-10
389 commits to master branch, last one 8 days ago
284
1.8k
apache-2.0
40
Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as Tensorflow, Pytorch, a...
Created 2018-06-15
691 commits to master branch, last one 11 months ago
354
1.8k
apache-2.0
138
A large-scale entity and relation database supporting aggregation of properties
Created 2015-12-14
7,298 commits to develop branch, last one 19 days ago
117
1.7k
apache-2.0
11
Rill is a tool for effortlessly transforming data sets into powerful, opinionated dashboards using SQL. BI-as-code.
Created 2021-12-09
3,953 commits to main branch, last one 4 hours ago
90
1.3k
apache-2.0
18
Quilt is a data mesh for connecting people with actionable data
Created 2017-02-10
4,858 commits to master branch, last one 21 hours ago
109
1.2k
apache-2.0
9
cryo is the easiest way to extract blockchain data to parquet, csv, json, or python dataframes
Created 2023-06-27
417 commits to main branch, last one 6 months ago
18
1.1k
agpl-3.0
6
Postgres read replica optimized for analytics
Created 2024-11-04
49 commits to main branch, last one 19 hours ago
308
1.0k
apache-2.0
99
ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed.
Created 2013-11-19
2,033 commits to master branch, last one 29 days ago
135
802
mit
49
ETL framework for .NET (Parser / Writer for CSV, Flat, XML, JSON, Key-Value, Parquet, YAML, Avro formatted files)
Created 2016-10-19
1,019 commits to master branch, last one 7 days ago
46
787
apache-2.0
14
A portable embedded database using Arrow.
Created 2024-07-15
293 commits to main branch, last one 20 hours ago
97
780
gpl-3.0
11
Simple Windows desktop application for viewing & querying Apache Parquet files
Created 2018-05-31
365 commits to master branch, last one 2 months ago
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Functions, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML...
Created 2015-10-27
5,403 commits to master branch, last one about a month ago
A Python library for fast, interactive geospatial vector data visualization in Jupyter.
Created 2023-08-31
395 commits to main branch, last one 9 days ago
66
581
mit
22
Graph Data Science: an abstraction layer in Python for building knowledge graphs, integrated with popular graph libraries – atop Pandas, NetworkX, RAPIDS, RDFlib, pySHACL, PyVis, morph-kgc, pslpython,...
Created 2020-10-25
724 commits to main branch, last one 6 months ago
101
564
apache-2.0
38
Fast data store for Pandas time-series data
Created 2018-05-26
210 commits to main branch, last one 4 months ago
Data Preview 🈸 extension for importing 📤 viewing 🔎 slicing 🔪 dicing 🎲 charting 📊 & exporting 📥 large JSON array/config, YAML, Apache Arrow, Avro, Parquet & Excel data files
Created 2018-12-12
696 commits to master branch, last one about a year ago
19
525
apache-2.0
6
Rust-based WebAssembly bindings to read and write Apache Parquet data
Created 2022-02-27
350 commits to main branch, last one 22 days ago
60
479
apache-2.0
351
Iceberg is a table format for large, slow-moving tabular data
Created 2017-12-13
278 commits to master branch, last one 5 years ago
15
456
other
3
Query anything (JSON, Salesforce, GitHub, etc.) with SQL and visualize your data with any MySQL-compatible BI tool.
Created 2024-04-06
296 commits to main branch, last one 20 hours ago
192
443
apache-2.0
49
Apache Parquet
This repository has been archived
Created 2014-06-10
503 commits to master branch, last one 6 months ago
A tool for batch loading data files (json, parquet, csv, tsv) into ElasticSearch
Created 2016-09-17
166 commits to master branch, last one 2 years ago
15
382
postgresql
2
DuckDB-powered analytics for Postgres
Created 2024-05-09
83 commits to dev branch, last one 2 days ago
Copy to/from Parquet in S3 from within PostgreSQL
Created 2024-09-04
42 commits to main branch, last one 9 days ago
Fastest and safest Rust implementation of parquet. `unsafe` free. Integration-tested against pyarrow
Created 2021-03-27
272 commits to main branch, last one about a year ago