13 results found

A list of AI agents and robots to block.
Created 2024-03-27
336 commits to main branch, last one 12 hours ago
78
978
unlicense
12
🤖/👨‍🦰 Detect bots/crawlers/spiders using the user agent string
Created 2015-07-24
329 commits to main branch, last one 4 days ago
92
352
other
40
An R web crawler and scraper
Created 2016-11-08
201 commits to master branch, last one 4 years ago
Open source SEO auditing tool.
Created 2022-03-02
785 commits to main branch, last one 16 days ago
67
186
apache-2.0
33
Norconex Crawlers (or spiders) are flexible web and filesystem crawlers for collecting, parsing, and manipulating data from the web or filesystem to various data repositories such as search engines.
Created 2013-02-20
1,245 commits to main branch, last one 16 days ago
10
148
zlib
7
Astray is a Lua-based maze, room, and dungeon generation library for dungeon crawlers and roguelike video games
Created 2014-01-08
20 commits to master branch, last one about a month ago
15
109
gpl-3.0
20
Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.
Created 2019-09-07
4,429 commits to v1.21.3-at branch, last one 2 months ago
Simple robots.txt template. Keeps unwanted robots out (disallow). Whitelists (allow) legitimate user-agents. Useful for all websites.
Created 2016-01-08
303 commits to master branch, last one 11 months ago
Vietnamese text data crawler scripts for various sites (including YouTube, Facebook, forums, news, ...)
This repository has been archived
Created 2020-02-28
14 commits to master branch, last one 2 years ago
hproxy - Asynchronous IP proxy pool that aims to make getting proxies as convenient as possible. (Asynchronous crawler proxy pool)
Created 2018-04-06
69 commits to master branch, last one 6 years ago
Raven is a powerful and customizable web crawler written in Go.
Created 2024-05-08
9 commits to main branch, last one 5 months ago
0
37
bsd-3-clause
1
Sneakpeek is a framework that helps to quickly and conveniently develop scrapers. It’s the best choice for scrapers that have some specific complex scraping logic that needs to be run on a constant ba...
Created 2021-01-17
190 commits to main branch, last one about a year ago