26 results found

End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow
Created 2016-11-13
266 commits to master branch, last one 3 years ago
120
1.4k
bsd-3-clause
20
Machine learning for dataframes
Created 2018-03-12
1,773 commits to main branch, last one a day ago
187
613
apache-2.0
19
Open source project for data preparation of LLM application builders
Created 2024-04-08
5,166 commits to dev branch, last one 4 days ago
Machine Learning library for the web and Node.
Created 2018-04-29
811 commits to master branch, last one 7 days ago
55
509
mit
4
Easy to use Python library of customized functions for cleaning and analyzing data.
Created 2020-03-25
887 commits to main branch, last one 3 months ago
Desbordante is a high-performance data profiler that can discover many different patterns in data using various algorithms. It also allows running data cleaning scenarios using these algor...
Created 2020-04-09
1,677 commits to main branch, last one 3 days ago
A dynamic, scalable AI chatbot built with Django REST framework, supporting custom training from PDFs, documents, websites, and YouTube videos. Leveraging OpenAI's GPT-3.5, Pinecone, FAISS, and Celery...
Created 2023-01-21
542 commits to main branch, last one about a year ago
Deal with bad samples in your dataset dynamically, use Transforms as Filters, and more!
Created 2018-10-05
30 commits to master branch, last one 3 years ago
37
133
gpl-3.0
11
Social Media Mining Toolkit (SMMT) main repository
Created 2020-02-05
106 commits to master branch, last one 2 years ago
The Triton backend that allows running GPU-accelerated data pre-processing pipelines implemented in DALI's Python API.
Created 2020-09-09
162 commits to main branch, last one 19 days ago
Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby
Created 2019-07-20
174 commits to master branch, last one 3 years ago
A time series signal analysis and classification framework
Created 2019-03-25
79 commits to master branch, last one 4 years ago
Resources of our survey paper "Optimizing Edge AI: A Comprehensive Survey on Data, Model, and System Strategies"
Created 2023-01-19
125 commits to main branch, last one 3 months ago
A quantitative study of over 1.25 million tweets about ChatGPT, employing data scraping, data cleaning, EDA, topic modeling, and sentiment analysis.
Created 2023-03-02
41 commits to main branch, last one about a year ago
20
51
bsd-3-clause
2
Learn2Clean: Optimizing the Sequence of Tasks for Data Preparation and Cleaning
Created 2019-03-29
96 commits to master branch, last one 4 years ago
A Python library for Automated Exploratory Data Analysis, Automated Data Cleaning, and Automated Data Preprocessing for Machine Learning and Natural Language Processing Applications.
Created 2021-03-14
71 commits to master branch, last one 2 years ago
This repository has no description...
Created 2023-11-13
144 commits to main branch, last one about a year ago
GWAS summary statistics files QC tool
Created 2021-09-22
68 commits to master branch, last one 3 months ago
This project focuses on data preprocessing and epilepsy seizure prediction using the CHB-MIT EEG dataset. It includes steps like data cleansing, feature extraction, and handling imbalanced datasets, a...
Created 2023-11-22
2 commits to main branch, last one about a year ago
Data stream analytics: Implement online learning methods to address concept drift and model drift in dynamic data streams. Code for the paper entitled "A Multi-Stage Automated Online Network Data Stre...
Created 2022-10-01
26 commits to main branch, last one 2 years ago
The objective of this assignment is to extract article text from the given URL and perform text analysis to compute variables that are explained
Created 2023-01-13
3 commits to main branch, last one 2 years ago