35 results found

790
1.1k
unknown
51
Starter code to solve real world text data problems. Includes: Gensim Word2Vec, phrase embeddings, Text Classification with Logistic Regression, word count with pyspark, simple text preprocessing, pre...
Created 2018-01-28
120 commits to master branch, last one 3 years ago
Fuzzy string matching, grouping, and evaluation.
Created 2020-11-21
26 commits to master branch, last one 8 months ago
Machine learning movie recommending system
Created 2018-01-23
98 commits to master branch, last one 2 months ago
45
450
mit
24
Selected Machine Learning algorithms for natural language processing and semantic analysis in Golang
Created 2017-03-15
145 commits to master branch, last one 3 years ago
36
291
other
10
Text2Text Language Modeling Toolkit
Created 2020-03-01
271 commits to master branch, last one 23 days ago
63
214
mit
22
Vietnamese NLP Toolkit for Node
Created 2016-09-01
140 commits to master branch, last one 8 months ago
Natural Language Processing (NLP) library for Crystal
Created 2018-03-11
170 commits to master branch, last one 4 years ago
This repository consists of a complete guide on natural language processing (NLP) in Python where we'll learn various techniques for implementing NLP including parsing & text processing and understand...
Created 2021-11-08
19 commits to main branch, last one 2 years ago
26
193
mit
8
Text vectorization tool to outperform TFIDF for classification tasks
Created 2018-04-12
74 commits to master branch, last one 10 months ago
22
186
mit
10
A Python Search Engine for Humans 🥸
Created 2022-11-16
44 commits to main branch, last one 12 months ago
IResearch is a cross-platform, high-performance search analytics library written entirely in C++, with a focus on pluggable ranking/similarity models
This repository has been archived
Created 2016-10-21
3,023 commits to master branch, last one 6 months ago
Implementation with some extensions of the paper "Snowball: Extracting Relations from Large Plain-Text Collections" (Agichtein and Gravano, 2000)
Created 2015-09-05
87 commits to master branch, last one 2 months ago
23
164
apache-2.0
15
Stringlifier is an open-source ML library for detecting random strings in raw text. It can be used in sanitising logs, detecting accidentally exposed credentials and as a pre-processing step in unsuper...
Created 2020-06-26
52 commits to main branch, last one 3 years ago
Arabic Open Domain Question Answering System using Neural Reading Comprehension
Created 2019-05-30
28 commits to master branch, last one about a year ago
11
148
apache-2.0
5
Simple NLP in Rust with Python bindings
Created 2018-11-05
143 commits to main branch, last one 4 years ago
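Product category prediction built with the Spring Boot framework and the Spark MLlib machine learning framework. It uses TF-IDF and a Bayes algorithm to train a model that predicts a product's category automatically from the product name. The project exposes a RESTful interface.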
Created 2018-12-04
15 commits to master branch, last one 5 years ago
68
138
mit
8
Social analysis based on WhatsApp data
Created 2018-10-17
26 commits to master branch, last one 4 years ago
Fast, efficient, in-memory Full Text Search for Kotlin
Created 2022-01-23
17 commits to main branch, last one 2 years ago
An example project using a feed-forward neural network for text sentiment classification trained with 25,000 movie reviews from the IMDB website.
Created 2018-12-09
57 commits to master branch, last one about a year ago
A structured collection of notes (mostly, on machine learning) and a Flask app for reading and searching them.
Created 2017-11-11
402 commits to master branch, last one 4 months ago
Emacs package that helps org-mode users (re)discover similar documents
Created 2020-11-26
130 commits to main branch, last one 9 months ago
Given a job title and job description, the algorithm assigns a standard occupational classification (SOC) code to the job.
Created 2017-11-08
98 commits to package branch, last one about a year ago
17
65
gpl-3.0
13
The greynir.is Icelandic natural language processing API and website.
Created 2015-05-06
3,405 commits to master branch, last one 3 months ago
21
64
apache-2.0
8
FBLYZE is a Facebook scraping and analysis system.
Created 2016-12-21
233 commits to master branch, last one 5 years ago
Extracts key terminology (n-grams) from any large collection of documents (>1000) and forecasts emergence
Created 2018-08-20
728 commits to develop branch, last one 3 years ago
Pipeline for quickly building TF-IDF + LogReg text classification baselines.
Created 2021-07-27
280 commits to main branch, last one 3 years ago
Fast Full Text Search based on BM25
Created 2017-05-25
82 commits to master branch, last one 2 years ago
A simple tool to generate tags for the given text (document) using TF-IDF.
Created 2017-02-22
21 commits to master branch, last one 3 years ago