81 results found

150
3.7k
other
27
Command-line tool for running SQL queries against JSON, CSV, Excel, Parquet, and more.
Created 2022-01-10
102 commits to main branch, last one 9 months ago
173
3.1k
apache-2.0
43
Create full-fledged APIs for slowly moving datasets without writing a single line of code.
Created 2020-12-11
258 commits to main branch, last one about a month ago
1.4k
2.5k
apache-2.0
94
Apache Parquet Java
Created 2014-06-10
2,644 commits to master branch, last one 2 days ago
65
2.3k
unlicense
13
CSVs sliced, diced & analyzed.
Created 2020-12-11
9,351 commits to master branch, last one 22 hours ago
985
1.9k
apache-2.0
154
Apache Drill is a distributed MPP query layer for self describing data
Created 2012-09-05
4,503 commits to master branch, last one 18 hours ago
281
1.8k
apache-2.0
41
Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as Tensorflow, Pytorch, a...
Created 2018-06-15
691 commits to master branch, last one 6 months ago
354
1.7k
apache-2.0
140
A large-scale entity and relation database supporting aggregation of properties
Created 2015-12-14
7,222 commits to develop branch, last one 5 days ago
422
1.7k
apache-2.0
65
Apache Parquet Format
Created 2014-06-10
370 commits to master branch, last one 6 days ago
106
1.5k
apache-2.0
9
Rill is a tool for effortlessly transforming data sets into powerful, opinionated dashboards using SQL. BI-as-code.
Created 2021-12-09
3,274 commits to main branch, last one 18 hours ago
92
1.3k
apache-2.0
19
Quilt is a data mesh for connecting people with actionable data
Created 2017-02-10
4,714 commits to master branch, last one a day ago
90
1.0k
apache-2.0
8
cryo is the easiest way to extract blockchain data to parquet, csv, json, or python dataframes
Created 2023-06-27
417 commits to main branch, last one about a month ago
305
966
apache-2.0
100
ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed.
Created 2013-11-19
2,028 commits to master branch, last one 5 months ago
134
752
mit
49
ETL framework for .NET (Parser / Writer for CSV, Flat, Xml, JSON, Key-Value, Parquet, Yaml, Avro formatted files)
Created 2016-10-19
1,017 commits to master branch, last one 19 days ago
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Functions, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML...
Created 2015-10-27
5,350 commits to master branch, last one 16 hours ago
82
685
gpl-3.0
13
Simple Windows desktop application for viewing & querying Apache Parquet files
Created 2018-05-31
338 commits to master branch, last one 19 days ago
64
564
mit
20
Graph Data Science: an abstraction layer in Python for building knowledge graphs, integrated with popular graph libraries – atop Pandas, NetworkX, RAPIDS, RDFlib, pySHACL, PyVis, morph-kgc, pslpython,...
Created 2020-10-25
724 commits to main branch, last one about a month ago
97
542
apache-2.0
37
Fast data store for Pandas time-series data
Created 2018-05-26
203 commits to main branch, last one 2 years ago
Data Preview 🈸 extension for importing 📤 viewing 🔎 slicing 🔪 dicing 🎲 charting 📊 & exporting 📥 large JSON array/config, YAML, Apache Arrow, Avro, Parquet & Excel data files
Created 2018-12-12
696 commits to master branch, last one about a year ago
A Python library for fast, interactive geospatial vector data visualization in Jupyter.
Created 2023-08-31
312 commits to main branch, last one 2 days ago
19
479
apache-2.0
7
Rust-based WebAssembly bindings to read and write Apache Parquet data
Created 2022-02-27
292 commits to main branch, last one 2 days ago
59
468
apache-2.0
347
Iceberg is a table format for large, slow-moving tabular data
Created 2017-12-13
278 commits to master branch, last one 5 years ago
193
438
apache-2.0
50
Apache Parquet
This repository has been archived
Created 2014-06-10
503 commits to master branch, last one about a month ago
A tool for batch loading data files (json, parquet, csv, tsv) into ElasticSearch
Created 2016-09-17
166 commits to master branch, last one 2 years ago
Fastest and safest Rust implementation of parquet. `unsafe` free. Integration-tested against pyarrow
Created 2021-03-27
272 commits to main branch, last one 9 months ago
Fully asynchronous, pure JavaScript implementation of the Parquet file format
Created 2017-04-30
292 commits to master branch, last one about a month ago
55
337
apache-2.0
29
A tool for data sampling, data generation, and data diffing
Created 2016-08-01
698 commits to master branch, last one 6 days ago
96
337
apache-2.0
10
Go library to read/write Parquet files
This repository has been archived
Created 2020-10-02
308 commits to main branch, last one 11 months ago
46
326
apache-2.0
22
Kotlin Bigdata Toolkit
Created 2013-10-16
432 commits to master branch, last one about a month ago
A cross-platform (Windows, Mac, Linux) desktop application to view common big data binary formats like Parquet, ORC, Avro, etc. Supports local file system, HDFS, AWS S3, Azure Blob Storage, etc.
Created 2020-02-05
119 commits to master branch, last one 2 months ago
Read and write Parquet in Scala. Use Scala classes as schema. No need to start a cluster.
Created 2018-08-26
477 commits to master branch, last one about a month ago