95 results found

⚡ From finding text to search and replace, from sorting to beautifying text and more 🎨
This repository has been archived
Created 2017-03-25
363 commits to master branch, last one about a year ago
1.1k
7.2k
apache-2.0
117
Diff Match Patch is a high-performance library in multiple languages that manipulates plain text.
Created 2018-01-23
44 commits to master branch, last one 4 years ago
134
5.5k
mit
26
Intuitive find & replace CLI (sed alternative)
Created 2018-12-23
320 commits to master branch, last one 8 days ago
426
4.3k
agpl-3.0
56
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
Created 2012-10-06
2,400 commits to main branch, last one a day ago
451
3.0k
apache-2.0
82
fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.
Created 2018-03-07
2,484 commits to master branch, last one about a year ago
275
2.1k
mit
27
Python library for creating PEG parsers
Created 2017-05-14
1,222 commits to master branch, last one a day ago
Program to convert lines of text into a tree structure.
Created 2020-04-30
70 commits to master branch, last one 2 years ago
Persian NLP Toolkit
Created 2013-10-29
1,406 commits to master branch, last one 29 days ago
64
1.1k
apache-2.0
11
The most accurate natural language detection library for Go, suitable for short text and mixed-language text
Created 2020-11-27
101 commits to main branch, last one 4 months ago
91
961
unlicense
19
A fast implementation of Aho-Corasick in Rust.
Created 2015-06-11
291 commits to master branch, last one 2 months ago
25
734
mpl-2.0
18
A fast and convenient fuzzy matcher library for Rust
Created 2023-07-27
95 commits to master branch, last one 7 days ago
18
685
unlicense
7
A sharp cut(1) clone.
Created 2021-06-24
191 commits to master branch, last one about a month ago
Ekphrasis is a text processing tool, geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashta...
Created 2017-02-07
77 commits to master branch, last one about a year ago
A simple Python module for parsing human names into their individual components
Created 2014-04-02
409 commits to master branch, last one 8 months ago
Natural language detection library for Go
Created 2017-02-20
34 commits to master branch, last one 5 years ago
Open Korean Text Processor - An Open-source Korean Text Processor
Created 2017-01-24
799 commits to master branch, last one 2 months ago
66
517
apache-2.0
4
All-in-one text de-duplication
Created 2021-03-13
363 commits to main branch, last one 10 days ago
67
477
gpl-3.0
32
PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks su...
Created 2010-07-06
2,160 commits to master branch, last one 8 months ago
84
425
gpl-3.0
36
pyarabic
Created 2014-02-17
160 commits to master branch, last one 4 months ago
🗣️ Tool to generate adversarial text examples and test machine learning models against them
Created 2018-08-08
15 commits to master branch, last one 5 years ago
115
372
gpl-3.0
9
Automatic Korean word spacing with Python
Created 2018-04-19
73 commits to master branch, last one about a month ago
59
371
apache-2.0
10
Text Normalization & Inverse Text Normalization
Created 2022-08-23
142 commits to master branch, last one 17 hours ago
A low level regular expression library that uses deterministic finite automata.
This repository has been archived
Created 2019-01-04
95 commits to master branch, last one 10 months ago
Pure-Python Japanese character interconverter for Hiragana, Katakana, Hankaku, and Zenkaku
Created 2016-04-02
127 commits to master branch, last one about a year ago
45
293
other
21
Fast and portable character string processing in R (with the Unicode ICU)
Created 2013-01-05
1,682 commits to master branch, last one 24 days ago
24
234
other
17
UNIC: Unicode and Internationalization Crates for Rust
Created 2017-06-20
769 commits to master branch, last one 3 years ago
21
217
mit
7
🌿 An easy-to-use Japanese Text Processing tool, which makes it possible to switch tokenizers with small changes of code.
Created 2018-08-22
431 commits to main branch, last one 16 days ago
5
206
mit
2
Streamlining Text Processing
Created 2024-03-17
112 commits to main branch, last one 6 days ago
Recreated sources for the book "UNIX Text Processing," published in 1987.
Created 2020-10-18
15 commits to main branch, last one 3 years ago