31 results found
☄️ Python's nested data operator (and CLI), for all your declarative restructuring needs. Got data? Glom it! ☄️
Created
2018-04-18
1,027 commits to master branch, last one 4 months ago
🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
Created
2017-07-13
6,411 commits to develop branch, last one about a year ago
Logical Replication extension for PostgreSQL 15, 14, 13, 12, 11, 10, 9.6, 9.5, 9.4 (Postgres), providing much faster replication than Slony, Bucardo or Londiste, as well as cross-version upgrades.
Created
2016-05-10
846 commits to REL2_x_STABLE branch, last one 3 days ago
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Created
2021-08-25
1,818 commits to main branch, last one 18 days ago
A block-based API for NSValueTransformer, with a growing collection of useful examples.
Created
2012-11-09
141 commits to master branch, last one 4 years ago
Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management.
Created
2021-03-22
487 commits to main branch, last one 10 months ago
Microsoft Program Synthesis using Examples SDK is a framework of technologies for the automatic generation of programs from input-output examples. This repo includes samples and sample data for the Mi...
Created
2015-10-21
224 commits to main branch, last one 2 months ago
💄 Durable and asynchronous data imports for consuming data at scale and publishing testable SDKs.
Created
2016-01-17
257 commits to master branch, last one 10 months ago
Advanced and Fast Data Transformation in R
Created
2019-02-27
2,903 commits to master branch, last one a day ago
Like awk but with SQL and table joins
Created
2015-01-16
239 commits to master branch, last one 21 days ago
Low-code Python library to safely use notebooks in production: schedule workflows, generate assets, trigger webhooks, send notifications, build pipelines, manage secrets (Cloud-only)
Created
2020-09-20
1,908 commits to main branch, last one 24 days ago
📄 Concise selector to extract JSON from HTML.
Created
2017-06-14
348 commits to master branch, last one 3 years ago
An Extensible Suite of High-Performance and Low-Dependency Packages for Statistical Computing and Data Manipulation in R
Created
2021-03-11
292 commits to main branch, last one 2 days ago
O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian
Created
2019-12-10
479 commits to master branch, last one 11 months ago
A simple Spark-powered ETL framework that just works 🍺
Created
2019-12-20
627 commits to master branch, last one about a year ago
A curated list of Clojure resources for dealing with domain-specific languages.
Created
2021-01-03
48 commits to master branch, last one 19 days ago
Data transformation and utility functions for R
Created
2015-03-21
952 commits to master branch, last one 18 days ago
Clojure Query: A Command-line Data Processor for JSON, YAML, EDN, XML and more
Created
2021-03-19
110 commits to main branch, last one 3 months ago
Big Data Modeling, MapReduce, Spark, PySpark @ Santa Clara University
Created
2014-12-04
868 commits to master branch, last one 11 days ago
🤖 An automated machine learning framework for audio, text, image, video, or .CSV files (50+ featurizers and 15+ model trainers). Python 3.6 required.
Created
2019-05-26
3,418 commits to master branch, last one 8 months ago
A visual data pipeline builder with various backends
Created
2019-01-23
2,744 commits to master branch, last one 2 days ago
A schema-aware Scala library for data transformation
Created
2021-02-05
544 commits to master branch, last one 3 months ago
Wrangler Transform: A DMD system for transforming Big Data
Created
2016-11-27
1,517 commits to develop branch, last one 25 days ago
Data transformation toolkit
Created
2019-08-18
834 commits to main branch, last one 4 months ago
breadroll 🥟 is a simple, lightweight library for data processing operations, written in TypeScript and powered by Bun.
Created
2023-06-02
162 commits to main branch, last one 3 months ago
Examples for working with DataWeave scripts from Apex.
This repository has been archived
Created
2021-11-18
71 commits to main branch, last one 3 months ago
Daany - .NET DAta ANalYtics library with implementations of DataFrame, time series decomposition, and linear algebra routines (BLAS and LAPACK).
Created
2019-09-22
395 commits to master branch, last one 6 months ago
object flow treatment, data transformation
Created
2016-05-15
1,682 commits to master branch, last one 14 days ago
⚡️ Next-generation data transformation framework for TypeScript that puts developer experience first
Created
2022-03-23
66 commits to main branch, last one 2 years ago
Bruin is a data pipeline tool that is designed to be easy-to-use. It allows building data pipelines using SQL and Python, and has built-in data quality checks.
Created
2023-08-03
425 commits to main branch, last one 23 hours ago