62 results found

2.3k
23.3k
apache-2.0
133
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
Created 2023-12-12
1,659 commits to main branch, last one 22 hours ago
762
9.2k
apache-2.0
60
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Created 2022-09-26
1,620 commits to main branch, last one 18 hours ago
403
3.4k
apache-2.0
34
A Chinese NLP preprocessing and parsing toolkit: accurate, efficient, and easy to use. www.jionlp.com
Created 2020-03-13
540 commits to master branch, last one 24 days ago
172
3.1k
mit
65
a delightful machine learning tool that allows you to train, test, and use models without writing code
Created 2020-08-27
429 commits to master branch, last one about a year ago
333
1.9k
mit
50
An ultra-fast all-in-one FASTQ preprocessor (QC/adapters/trimming/filtering/splitting/merging...)
Created 2017-10-31
459 commits to master branch, last one 13 days ago
274
1.5k
other
65
MLBox is a powerful Automated Machine Learning Python library.
Created 2017-06-01
1,121 commits to master branch, last one 4 years ago
100
1.1k
mit
22
Automated Time Series Forecasting
Created 2019-11-26
950 commits to master branch, last one 2 days ago
143
1.1k
apache-2.0
34
NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte-scale datasets used to train deep-learning-based recommender systems.
Created 2020-04-03
1,075 commits to main branch, last one 3 months ago
Audio processing using a PyTorch 1D convolution network
Created 2019-09-02
305 commits to master branch, last one 9 months ago
A Deep Learning Python Toolkit for Healthcare Applications.
Created 2020-08-03
742 commits to master branch, last one 6 months ago
296
878
mit
68
Collection of various algorithms implemented in R.
Created 2018-09-23
287 commits to master branch, last one 7 days ago
79
649
bsd-3-clause
24
High performance model preprocessing library on PyTorch
This repository has been archived
Created 2021-09-27
439 commits to main branch, last one about a year ago
✔️Contextual word checker for better suggestions (not actively maintained)
Created 2020-04-10
183 commits to master branch, last one about a year ago
Deal with bad samples in your dataset dynamically, use Transforms as Filters, and more!
Created 2018-10-05
30 commits to master branch, last one 3 years ago
103
377
unknown
41
A curated list of awesome CAE frameworks, libraries and software.
Created 2016-08-21
36 commits to master branch, last one 3 months ago
Pure-Python Japanese character interconverter for Hiragana, Katakana, Hankaku, and Zenkaku
Created 2016-04-02
132 commits to master branch, last one 3 months ago
48
298
apache-2.0
19
Cylon is a fast, scalable, distributed-memory, parallel runtime with a Pandas-like DataFrame.
Created 2019-10-10
1,307 commits to main branch, last one 11 months ago
20
274
apache-2.0
8
Japanese text normalizer for mecab-neologd
Created 2015-07-21
103 commits to master branch, last one 6 months ago
Just some tool repackers like to use...
This repository has been archived
Created 2022-01-09
29 commits to main branch, last one about a year ago
32
222
apache-2.0
13
[WIP] VoiceSmith makes training text to speech models easy.
Created 2022-05-17
238 commits to main branch, last one 2 years ago
44
198
apache-2.0
10
Analysis-ready CMIP6 data in Python the easy way with Pangeo tools.
Created 2019-10-16
615 commits to main branch, last one 4 months ago
TFRecorder makes it easy to create TensorFlow records (TFRecords) from Pandas DataFrames and CSV files containing images or structured data.
This repository has been archived
Created 2020-07-24
128 commits to main branch, last one 3 years ago
This is the preprocessing step of the LIDC-IDRI dataset
Created 2020-04-24
30 commits to master branch, last one 4 years ago
50
155
gpl-3.0
15
An "R" package for automatic download and preprocessing of MODIS Land Products Time Series
Created 2014-07-09
1,619 commits to master branch, last one 4 months ago
The deslanting algorithm sets text upright in images. Python, C++ and OpenCL implementations provided.
Created 2018-01-25
34 commits to master branch, last one 3 years ago
25
140
lgpl-3.0
18
Dataflow Programming for Machine Learning in R
Created 2017-10-10
3,246 commits to master branch, last one a day ago
PyPREP: A Python implementation of the Preprocessing Pipeline (PREP) for EEG data
Created 2018-04-12
387 commits to main branch, last one 2 days ago
58
138
bsd-3-clause
12
Automated rejection and repair of bad trials/sensors in M/EEG
Created 2016-05-23
565 commits to main branch, last one a day ago
Mambular is a Python package that brings the power of Mamba architectures to tabular data, offering a suite of deep learning models for regression, classification, and distributional regression tasks....
Created 2024-05-03
422 commits to master branch, last one about a month ago