26 results found

1.8k
11.7k
bsd-3-clause
214
Distributed web crawler admin platform for managing spiders regardless of language or framework.
Created 2019-02-10
2,674 commits to main branch, last one 5 months ago
A next-generation crawler platform that defines crawl workflows graphically, so a crawler can be built without writing code.
Created 2020-03-27
482 commits to master branch, last one 2 years ago
General-purpose body-text extractor for news web pages (Beta).
Created 2019-09-08
154 commits to master branch, last one 9 months ago
592
2.0k
other
79
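蓝天采集器 is a free, open-source crawler system that collects data with nothing more than point-and-click rule editing. It runs locally, on a virtual host, or on a cloud server, can collect almost any type of web page, integrates seamlessly with various CMS site builders, publishes data in real time without login, and runs fully automatically with no manual intervention. Among web big-data collection tools, it is a fully cross-platform cloud crawler system.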
Created 2018-02-26
27 commits to master branch, last one 5 days ago
52
1.4k
mit
17
A Unix-style personal search engine and web crawler for your digital footprint.
Created 2021-07-17
40 commits to master branch, last one 3 years ago
160
852
bsd-3-clause
44
HTTP API for Scrapy spiders
Created 2015-01-06
247 commits to master branch, last one about a year ago
70
634
unknown
10
Advanced web security spider/crawler
Created 2023-04-05
26 commits to main branch, last one about a year ago
Uscrapper Vanta: Dive deeper into the web with this powerful open-source tool. Extract valuable insights with ease and efficiency, from both surface and deep web sources. Empower your data mining and ...
Created 2023-05-31
63 commits to main branch, last one 4 months ago
189
505
apache-2.0
76
Open-source Enterprise Grade Search Engine Software
Created 2013-07-18
5,642 commits to master branch, last one 3 years ago
Companion source code for the book 《Python爬虫开发 从入门到实战》 (Python Crawler Development: From Beginner to Practice).
Created 2018-10-01
27 commits to master branch, last one 5 years ago
91
354
other
39
An R web crawler and scraper
Created 2016-11-08
201 commits to master branch, last one 4 years ago
DotnetCrawler is a straightforward, lightweight web crawling/scraping library for Entity Framework Core output based on dotnet core. This library is designed like other strong crawler libraries like Web...
Created 2019-02-19
55 commits to master branch, last one 5 years ago
44
166
gpl-3.0
6
This program provides efficient web scraping services for Tor and non-Tor sites. The program has both a CLI and REST API.
Created 2018-06-02
223 commits to main branch, last one 11 months ago
A web crawling framework written in Kotlin
Created 2016-10-24
211 commits to master branch, last one 4 years ago
28
119
gpl-3.0
5
Web scraper, crawler and parser in C#. Designed as a simple, declarative and scalable web scraping solution.
Created 2022-04-12
618 commits to master branch, last one 5 months ago
27
98
unknown
6
Stick to doing something interesting and valuable.
Created 2018-10-19
1,115 commits to master branch, last one about a year ago
A Web Crawler based on LLMs implemented with Ray and Huggingface. The embeddings are saved into a vector database for fast clustering and retrieval. Use it for your RAG.
Created 2023-09-28
9 commits to main branch, last one about a year ago
The data and code used in my book.
Created 2017-07-01
18 commits to master branch, last one 9 months ago
8
61
unknown
1
2019-nCoV real-time tracking system based on Scrapy + influxdb + grafana + NLTK + Stanford CoreNLP
Created 2020-02-02
8 commits to master branch, last one 4 years ago
9
58
apache-2.0
7
Turing ES - Enterprise Search, Semantic Navigation, Chatbot using Search Engine and Generative AI.
Created 2016-10-12
4,792 commits to main branch, last one 19 days ago
A web browser hosted as a service, to render your JavaScript web pages as HTML
Created 2017-02-09
2,496 commits to master branch, last one 3 years ago
Bot for monitoring promotions on the Hardmob forum: http://www.hardmob.com.br/promocoes/
Created 2017-12-31
101 commits to master branch, last one 3 years ago
A set of useful and scalable spiders to crawl data/videos from bilibili, xiaohongshu, etc.
Created 2023-11-09
125 commits to main branch, last one about a year ago
A declarative and easy-to-use web crawler and scraper in C#
Created 2024-07-05
24 commits to main branch, last one 6 months ago
4
25
gpl-3.0
4
CodeBRT is an AI program generation plugin for VSCode. It helps you quickly generate code through AI, thus improving development efficiency.
Created 2024-07-04
917 commits to main branch, last one 20 hours ago