43 results found
- Filter by Primary Language:
- R (11)
- Jupyter Notebook (9)
- Python (5)
- C# (2)
- HTML (2)
- TypeScript (2)
- Java (1)
- Perl (1)
- Go (1)
- C++ (1)
- Rust (1)
- Shell (1)
- Tcl (1)
- Julia (1)
OpenRefine is a free, open source power tool for working with messy data and improving it
Created
2012-10-15
8,136 commits to master branch, last one 3 days ago
Select, put and delete data from JSON, TOML, YAML, XML and CSV files with a single tool. Supports conversion between formats and can be used as a Go package.
Created
2020-09-22
697 commits to master branch, last one 15 days ago
A Collection of Cheatsheets, Books, Questions, and Portfolio For DS/ML Interview Prep
Created
2018-08-09
518 commits to master branch, last one 3 years ago
Carefully curated resource links for data science in one place
Created
2018-12-27
112 commits to master branch, last one about a year ago
CSVs sliced, diced & analyzed.
Created
2020-12-11
9,111 commits to master branch, last one 15 hours ago
A Python toolbox for gaining geometric insights into high-dimensional data
Created
2016-09-27
1,652 commits to master branch, last one 2 months ago
Zui is a powerful desktop application for exploring and working with data. The official front-end to the Zed lake.
Created
2018-08-09
5,351 commits to main branch, last one 7 days ago
Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
Created
2017-07-13
6,411 commits to develop branch, last one about a year ago
The JavaScript data transformation and analysis toolkit inspired by Pandas and LINQ.
Created
2017-11-27
509 commits to master branch, last one 2 months ago
Prepping tables for machine learning
Created
2018-03-12
1,541 commits to main branch, last one 2 days ago
Statistical Inference via Data Science: A ModernDive into R and the Tidyverse
Created
2016-08-29
2,186 commits to master branch, last one 7 days ago
Microsoft Program Synthesis using Examples SDK is a framework of technologies for the automatic generation of programs from input-output examples. This repo includes samples and sample data for the Mi...
Created
2015-10-21
224 commits to main branch, last one 2 months ago
Materials for following along with Hands-On Data Analysis with Pandas – Second Edition
Created
2020-08-24
628 commits to master branch, last one 6 months ago
Materials for following along with Hands-On Data Analysis with Pandas.
Created
2018-09-15
228 commits to master branch, last one about a year ago
Desbordante is a high-performance data profiler that is capable of discovering many different patterns in data using various algorithms. It also allows running data cleaning scenarios using these algor...
Topics: data-mining, correlations, data-science, spreadsheets, tabular-data, data-cleaning, data-analytics, data-cleansing, data-profiling, data-wrangling, data-engineering, data-exploration, anomaly-detection, feature-selection, data-preprocessing, feature-extraction, feature-engineering, knowledge-discovery, data-mining-algorithms, exploratory-data-analysis
Created
2020-04-09
1,194 commits to main branch, last one a day ago
An introductory workshop on pandas with notebooks and exercises for following along.
Created
2021-05-15
265 commits to main branch, last one about a month ago
Like awk but with SQL and table joins
Created
2015-01-16
239 commits to master branch, last one 21 days ago
Package that processes and organizes data from the Cadastro Nacional da Pessoa Jurídica (CNPJ), Brazil's national registry of legal entities
Created
2019-03-26
77 commits to master branch, last one 3 years ago
Data Analysis and Visualization in R for Ecologists - the version at https://github.com/datacarpentry/R-ecology-lesson-alternative will be merged on 8th July 2024
Created
2015-04-02
1,238 commits to main branch, last one 2 months ago
Tools for test driven data-wrangling and data validation.
Created
2016-05-12
2,173 commits to master branch, last one 2 years ago
Data Cleaning Libraries with Python
Created
2017-04-24
13 commits to master branch, last one 5 years ago
Catmandu - a data processing toolkit
Created
2010-07-01
2,479 commits to dev branch, last one 6 months ago
R for Reproducible Scientific Analysis
Created
2015-04-18
1,597 commits to main branch, last one 19 days ago
Plotting and Programming in Python
Created
2016-01-07
1,244 commits to main branch, last one about a month ago
Data Analysis and Visualization in Python for Ecologists
Created
2015-03-19
1,166 commits to main branch, last one 2 months ago
Programming with R
Created
2014-12-18
1,386 commits to main branch, last one 24 days ago
Data transformation and utility functions for R
Created
2015-03-21
952 commits to master branch, last one 18 days ago
Springboard Program: Data Science Career Track - NLP
Created
2018-10-12
467 commits to master branch, last one 3 years ago
D-Lab's 12-hour introduction to R Fundamentals. Learn how to create variables and functions, manipulate data frames, make visualizations, use control flow structures, and more, using R in RStudio.
Created
2016-11-14
280 commits to main branch, last one about a year ago
CSV Lint plug-in for Notepad++ for syntax highlighting, csv validation, automatic column and datatype detecting, fixed width datasets, change datetime format, decimal separator, sort data, count uniqu...
Created
2019-12-15
331 commits to master branch, last one 2 months ago