109 results found
- Filter by Primary Language:
- Python (46)
- Rust (15)
- Go (11)
- Jupyter Notebook (8)
- Shell (6)
- C++ (4)
- Java (3)
- JavaScript (3)
- C# (2)
- TypeScript (2)
- R (1)
- Roff (1)
- HTML (1)
- Scala (1)
- Elixir (1)
- Swift (1)
- Kotlin (1)
⚡ From finding text to search and replace, from sorting to beautifying text and more 🎨
This repository has been archived
Created
2017-03-25
364 commits to master branch, last one 6 months ago
Diff Match Patch is a high-performance library in multiple languages that manipulates plain text.
This repository has been archived
Created
2018-01-23
44 commits to master branch, last one 5 years ago
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
Created
2012-10-06
2,718 commits to main branch, last one 3 days ago
Intuitive find & replace CLI (sed alternative)
Created
2018-12-23
320 commits to master branch, last one 7 months ago
fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.
Created
2018-03-07
2,484 commits to master branch, last one 2 years ago
Python library for creating PEG parsers
Created
2017-05-14
1,355 commits to master branch, last one 6 days ago
🦛 CHONK your texts with Chonkie ✨ - The no-nonsense RAG chunking library
Created
2024-11-01
323 commits to main branch, last one 14 days ago
Text Classification Algorithms: A Survey
deep-learning
random-forest
decision-trees
text-processing
rocchio-algorithm
boosting-algorithms
deep-belief-network
deep-neural-network
logistic-regression
text-classification
k-nearest-neighbours
nlp-machine-learning
naive-bayes-classifier
document-classification
support-vector-machines
dimensionality-reduction
conditional-random-fields
recurrent-neural-networks
convolutional-neural-networks
hierarchical-attention-networks
Created
2018-07-06
230 commits to master branch, last one 2 months ago
Persian NLP Toolkit
Created
2013-10-29
1,411 commits to master branch, last one 6 months ago
Program to convert lines of text into a tree structure.
Created
2020-04-30
70 commits to master branch, last one 3 years ago
The most accurate natural language detection library for Go, suitable for short text and mixed-language text
Created
2020-11-27
108 commits to main branch, last one 4 days ago
A fast implementation of Aho-Corasick in Rust.
Created
2015-06-11
293 commits to master branch, last one 2 months ago
Thai natural language processing in Python
Created
2016-06-23
4,833 commits to dev branch, last one 4 days ago
A fast and convenient fuzzy matcher library for Rust
Created
2023-07-27
114 commits to master branch, last one 6 days ago
A sharp cut(1) clone.
Created
2021-06-24
197 commits to master branch, last one 18 days ago
Ekphrasis is a text processing tool, geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashta...
Created
2017-02-07
77 commits to master branch, last one 2 years ago
A simple Python module for parsing human names into their individual components
Created
2014-04-02
409 commits to master branch, last one about a year ago
Natural language detection library for Go
Created
2017-02-20
34 commits to master branch, last one 5 years ago
All-in-one text de-duplication
Created
2021-03-13
363 commits to main branch, last one 7 months ago
Open Korean Text Processor - An Open-source Korean Text Processor
Created
2017-01-24
799 commits to master branch, last one 9 months ago
Text Normalization & Inverse Text Normalization
Created
2022-08-23
192 commits to master branch, last one about a month ago
PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks su...
Created
2010-07-06
2,160 commits to master branch, last one about a year ago
pyarabic
Created
2014-02-17
160 commits to master branch, last one 11 months ago
SQL Syntax without any database
Created
2018-01-29
536 commits to master branch, last one 3 days ago
Automatic Korean word spacing with Python
Created
2018-04-19
87 commits to master branch, last one 5 months ago
🗣️ Tool to generate adversarial text examples and test machine learning models against them
Created
2018-08-08
15 commits to master branch, last one 6 years ago
Turn PDFs and EPUBs into audiobooks, subtitles or videos into dubbed videos (including translation), and more. For free. Pandrator uses local models, notably XTTS, including voice-cloning (instant, RV...
Created
2024-03-20
315 commits to main branch, last one about a month ago
A low level regular expression library that uses deterministic finite automata.
This repository has been archived
Created
2019-01-04
95 commits to master branch, last one about a year ago
Pure-Python Japanese character interconverter for Hiragana, Katakana, Hankaku, and Zenkaku
Created
2016-04-02
132 commits to master branch, last one 4 months ago
Fast and portable character string processing in R (with the Unicode ICU)
Created
2013-01-05
1,684 commits to master branch, last one 5 months ago