95 results found
- Filter by Primary Language:
- Python (39)
- Rust (14)
- Go (11)
- Shell (6)
- Jupyter Notebook (6)
- C++ (4)
- Java (3)
- JavaScript (3)
- TypeScript (2)
- R (1)
- Roff (1)
- HTML (1)
- Scala (1)
- Elixir (1)
- C# (1)
⚡ From finding text to search and replace, from sorting to beautifying text and more 🎨
This repository has been archived
Created
2017-03-25
363 commits to master branch, last one about a year ago
Diff Match Patch is a high-performance library in multiple languages that manipulates plain text.
Created
2018-01-23
44 commits to master branch, last one 4 years ago
Intuitive find & replace CLI (sed alternative)
Created
2018-12-23
320 commits to master branch, last one 8 days ago
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
Created
2012-10-06
2,400 commits to main branch, last one a day ago
fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.
Created
2018-03-07
2,484 commits to master branch, last one about a year ago
Python library for creating PEG parsers
Created
2017-05-14
1,222 commits to master branch, last one a day ago
Text Classification Algorithms: A Survey
Topics: deep-learning, random-forest, decision-trees, text-processing, rocchio-algorithm, boosting-algorithms, deep-belief-network, deep-neural-network, logistic-regression, text-classification, k-nearest-neighbours, nlp-machine-learning, naive-bayes-classifier, document-classification, support-vector-machines, dimensionality-reduction, conditional-random-fields, recurrent-neural-networks, convolutional-neural-networks, hierarchical-attention-networks
Created
2018-07-06
228 commits to master branch, last one about a year ago
Program to convert lines of text into a tree structure.
Created
2020-04-30
70 commits to master branch, last one 2 years ago
Persian NLP Toolkit
Created
2013-10-29
1,406 commits to master branch, last one 29 days ago
The most accurate natural language detection library for Go, suitable for short text and mixed-language text
Created
2020-11-27
101 commits to main branch, last one 4 months ago
A fast implementation of Aho-Corasick in Rust.
Created
2015-06-11
291 commits to master branch, last one 2 months ago
A fast and convenient fuzzy matcher library for Rust
Created
2023-07-27
95 commits to master branch, last one 7 days ago
A sharp cut(1) clone.
Created
2021-06-24
191 commits to master branch, last one about a month ago
Ekphrasis is a text processing tool, geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashta...
Created
2017-02-07
77 commits to master branch, last one about a year ago
A simple Python module for parsing human names into their individual components
Created
2014-04-02
409 commits to master branch, last one 8 months ago
Natural language detection library for Go
Created
2017-02-20
34 commits to master branch, last one 5 years ago
Open Korean Text Processor - An Open-source Korean Text Processor
Created
2017-01-24
799 commits to master branch, last one 2 months ago
All-in-one text de-duplication
Created
2021-03-13
363 commits to main branch, last one 10 days ago
PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks su...
Created
2010-07-06
2,160 commits to master branch, last one 8 months ago
pyarabic
Created
2014-02-17
160 commits to master branch, last one 4 months ago
🗣️ Tool to generate adversarial text examples and test machine learning models against them
Created
2018-08-08
15 commits to master branch, last one 5 years ago
Automatic Korean word spacing with Python
Created
2018-04-19
73 commits to master branch, last one about a month ago
Text Normalization & Inverse Text Normalization
Created
2022-08-23
142 commits to master branch, last one 17 hours ago
A low level regular expression library that uses deterministic finite automata.
This repository has been archived
Created
2019-01-04
95 commits to master branch, last one 10 months ago
Pure-Python Japanese character interconverter for Hiragana, Katakana, Hankaku, and Zenkaku
Created
2016-04-02
127 commits to master branch, last one about a year ago
Fast and portable character string processing in R (with the Unicode ICU)
Created
2013-01-05
1,682 commits to master branch, last one 24 days ago
UNIC: Unicode and Internationalization Crates for Rust
Created
2017-06-20
769 commits to master branch, last one 3 years ago
🌿 An easy-to-use Japanese Text Processing tool, which makes it possible to switch tokenizers with small changes of code.
Created
2018-08-22
431 commits to main branch, last one 16 days ago
Streamlining Text Processing
Created
2024-03-17
112 commits to main branch, last one 6 days ago
Recreated sources for the book "UNIX Text Processing," published in 1987.
Created
2020-10-18
15 commits to main branch, last one 3 years ago