Trending repositories for topic scrapy

Last 3 days (new repositories)

no newly created repositories trending in the last 3 days

Last 3 days (absolute gain)

Faker-lz/Topic_and_user_profile_analysis_system

Weibo-based system for online public-opinion topic analysis and user profiling

389 (+5)

crawlab-team/crawlab

Distributed web crawler admin platform for spiders management regardless of languages and frameworks.

11,415 (+4)

bsd-3-clause

scrapy-plugins/scrapy-playwright

🎭 Playwright integration for Scrapy

1,053 (+3)

bsd-3-clause

Lan-ce-lot/pythorch-text-classification

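Text classification and sentiment analysis of Douban movie reviews: a crawler scrapes comments from Douban, followed by data cleaning and word segmentation; models such as BERT, CNN, and LSTM are trained, with tensorboardX visualizing the training process. A natural language processing project for text classification, based on torch 1.7.1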

123 (+2)

apache-2.0

alanchn31/Data-Engineering-Projects

Personal Data Engineering Projects

876 (+2)

ityard/python-fxxk-spider

A collection of free Python web crawler projects

1,084 (+2)

apache-2.0

TheWebScrapingClub/webscraping-from-0-to-hero

The web scraping open project repository aims to share knowledge and experiences about web scraping with Python

1,566 (+2)

LuckyZXL2016/Movie_Recommend

Spark-based movie recommendation system, including a crawler project, a website, an admin back end, and the Spark recommendation engine

2,846 (+2)

mit

Boris-code/feapder

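🚀🚀🚀feapder is an easy to use, powerful Python crawler framework. It has four built-in spider types (AirSpider, Spider, TaskSpider, BatchSpider) for different scenarios, and supports resumable crawling, monitoring and alerting, browser rendering, and large-scale data deduplication. The powerful crawler management system feaplat also provides it with...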

3,034 (+2)

rmax/scrapy-redis

Redis-based components for Scrapy.

5,551 (+2)

mit

lining0806/PythonSpiderNotes

The essentials of web crawling for Python beginners

7,016 (+2)

chyroc/WechatSogou

WeChat official account crawler interface based on Sogou WeChat search

5,946 (+2)

apache-2.0

mikumifa/BiliShareMall

Graphical crawler and search script for the Magic Reward (魔力赏) marketplace in Bilibili's member shop

50 (+1)

moyada/stealer

Video watermark remover for Douyin, Kuaishou, Huoshan, and Pipixia

1,002 (+1)

mit

librauee/Reptile

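🏀 Hands-on Python 3 web crawlers (some with detailed tutorials): Maoyan, Tencent Video, Douban, Yanzhaowang (graduate admissions site), Weibo, Biquge novels, Baidu Hot Topics, Bilibili, CSDN, NetEase Cloud Reading, Ali Literature, Baidu Stocks, Toutiao, WeChat official accounts, NetEase Cloud Music, Lagou, Youdao, Unsplash, Shixiseng, Autohome, League of Legends Box, Dianping, Lianjia, LPL schedule, typhoons, Fantasy Westward Journey and Onmyoji treasure pavilions (藏宝阁), weather, Nowcoder, Baidu Wenku, bedtime stories, Zhihu, Wish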

1,615 (+1)

Last 3 days (relative gain)

mikumifa/BiliShareMall

Graphical crawler and search script for the Magic Reward (魔力赏) marketplace in Bilibili's member shop

50 (+2%)

Lan-ce-lot/pythorch-text-classification

123 (+2%)

apache-2.0

Faker-lz/Topic_and_user_profile_analysis_system

Weibo-based system for online public-opinion topic analysis and user profiling

389 (+1%)

scrapy-plugins/scrapy-playwright

🎭 Playwright integration for Scrapy

1,053 (+0.3%)

bsd-3-clause

alanchn31/Data-Engineering-Projects

Personal Data Engineering Projects

876 (+0.2%)

ityard/python-fxxk-spider

A collection of free Python web crawler projects

1,084 (+0.2%)

apache-2.0

TheWebScrapingClub/webscraping-from-0-to-hero

The web scraping open project repository aims to share knowledge and experiences about web scraping with Python

1,566 (+0.1%)

moyada/stealer

Video watermark remover for Douyin, Kuaishou, Huoshan, and Pipixia

1,002 (+0.1%)

mit

LuckyZXL2016/Movie_Recommend

Spark-based movie recommendation system, including a crawler project, a website, an admin back end, and the Spark recommendation engine

2,846 (+0.1%)

mit

Boris-code/feapder

3,034 (+0.1%)

librauee/Reptile

1,615 (+0.1%)

rmax/scrapy-redis

Redis-based components for Scrapy.

5,551 (+0.0%)

mit

crawlab-team/crawlab

Distributed web crawler admin platform for spiders management regardless of languages and frameworks.

11,415 (+0.0%)

bsd-3-clause

chyroc/WechatSogou

WeChat official account crawler interface based on Sogou WeChat search

5,946 (+0.0%)

apache-2.0

lining0806/PythonSpiderNotes

The essentials of web crawling for Python beginners

7,016 (+0.0%)

Last week (new repositories)

no newly created repositories trending in the last week

Last week (absolute gain)

Faker-lz/Topic_and_user_profile_analysis_system

Weibo-based system for online public-opinion topic analysis and user profiling

389 (+12)

Boris-code/feapder

3,034 (+10)

scrapy-plugins/scrapy-playwright

🎭 Playwright integration for Scrapy

1,053 (+8)

bsd-3-clause

LuckyZXL2016/Movie_Recommend

Spark-based movie recommendation system, including a crawler project, a website, an admin back end, and the Spark recommendation engine

2,846 (+7)

mit

lining0806/PythonSpiderNotes

The essentials of web crawling for Python beginners

7,016 (+6)

crawlab-team/crawlab

Distributed web crawler admin platform for spiders management regardless of languages and frameworks.

11,415 (+6)

bsd-3-clause

xingag/spider_python

Python crawlers

995 (+5)

apache-2.0

DropsDevopsOrg/ECommerceCrawlers

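Hands-on 🐍 crawlers for a variety of websites and e-commerce data 🕷. Includes 🕸: Taobao products, WeChat official accounts, Dianping, Qichacha, job sites, Xianyu, Ali tasks, Cnblogs, Weibo, Baidu Tieba, Douban Movies, Ibaotu, Quanjing, Douban Music, a provincial drug administration, Sohu News, machine-learning text collection, fofa asset collection, Autohome, National Bureau of Statistics, Baidu keyword index counts, spider pan-directory, Toutiao, Douban film reviews, Ctrip, Xiaomi App Store, Anjuke, Tujia homestays ❤️❤️❤️. WeChat crawler showcase project: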

4,811 (+5)

mit

unnohwn/telegram-scraper

A powerful Python script that allows you to scrape messages and media from Telegram channels using the Telethon library. Features include real-time continuous scraping, media downloading, and data exp...

99 (+4)

mit

chyroc/WechatSogou

WeChat official account crawler interface based on Sogou WeChat search

5,946 (+4)

apache-2.0

mikumifa/BiliShareMall

Graphical crawler and search script for the Magic Reward (魔力赏) marketplace in Bilibili's member shop

50 (+3)

wuzhy1ng/BlockchainSpider

A toolkit for blockchain data collection

128 (+3)

TikHub/TikHub-API-Python-SDK

High-performance asynchronous API for Douyin, TikTok, Xiaohongshu, Kuaishou, Weibo, Instagram, YouTube, Twitter (X), Captcha Solver, and Temp Mail.

365 (+3)

apache-2.0

casual-silva/NewsCrawl

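Open-sourced enterprise-grade public-opinion news crawler project: supports one-click running of any number of crawlers, scheduled crawler tasks, and batch deletion of crawlers; one-click crawler deployment; crawler monitoring visualization; configurable crawler allocation strategy for clusters; 👉 ready-made, already debugged Docker one-click deployment docs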

556 (+3)

eracle/linkedin

LinkedIn Scraper using Selenium Web Driver, Chromium headless, Docker and Scrapy

803 (+3)

ityard/python-fxxk-spider

A collection of free Python web crawler projects

1,084 (+3)

apache-2.0

eliasdabbas/advertools

advertools - online marketing productivity and analysis tools

1,164 (+3)

mit

librauee/Reptile

1,615 (+3)

rmax/scrapy-redis

Redis-based components for Scrapy.

5,551 (+3)

mit

Lan-ce-lot/pythorch-text-classification

123 (+2)

apache-2.0

Last week (relative gain)

mikumifa/BiliShareMall

Graphical crawler and search script for the Magic Reward (魔力赏) marketplace in Bilibili's member shop

50 (+6%)

unnohwn/telegram-scraper

99 (+4%)

mit

Faker-lz/Topic_and_user_profile_analysis_system

Weibo-based system for online public-opinion topic analysis and user profiling

389 (+3%)

wuzhy1ng/BlockchainSpider

A toolkit for blockchain data collection

128 (+2%)

Lan-ce-lot/pythorch-text-classification

123 (+2%)

apache-2.0

Lan-ce-lot/weibo-opinion-analysis

Public-opinion analysis project for Weibo text data, including a Weibo crawler, LDA topic analysis, and sentiment analysis

62 (+2%)

apache-2.0

hwaves/m3u8_To_MP4

Python downloader for saving m3u8 videos to local MP4 files.

73 (+1%)

mit

jxlil/scrapy-impersonate

Scrapy download handler that can impersonate browsers' TLS signatures or JA3 fingerprints.

114 (+0.9%)

mit

TikHub/TikHub-API-Python-SDK

365 (+0.8%)

apache-2.0

scrapy-plugins/scrapy-playwright

🎭 Playwright integration for Scrapy

1,053 (+0.8%)

bsd-3-clause

rockswang/java-curl

Pure java CURL implementation

173 (+0.6%)

apache-2.0

casual-silva/NewsCrawl

556 (+0.5%)

xingag/spider_python

Python crawlers

995 (+0.5%)

apache-2.0

eracle/linkedin

LinkedIn Scraper using Selenium Web Driver, Chromium headless, Docker and Scrapy

803 (+0.4%)

Boris-code/feapder

3,034 (+0.3%)

ityard/python-fxxk-spider

A collection of free Python web crawler projects

1,084 (+0.3%)

apache-2.0

eliasdabbas/advertools

advertools - online marketing productivity and analysis tools

1,164 (+0.3%)

mit

LuckyZXL2016/Movie_Recommend

Spark-based movie recommendation system, including a crawler project, a website, an admin back end, and the Spark recommendation engine

2,846 (+0.2%)

mit

alanchn31/Data-Engineering-Projects

Personal Data Engineering Projects

876 (+0.2%)

moyada/stealer

Video watermark remover for Douyin, Kuaishou, Huoshan, and Pipixia

1,002 (+0.2%)

mit

Last month (new repositories)

no newly created repositories trending in the last month

Last month (absolute gain)

Boris-code/feapder

3,034 (+56)

crawlab-team/crawlab

Distributed web crawler admin platform for spiders management regardless of languages and frameworks.

11,415 (+54)

bsd-3-clause

lining0806/PythonSpiderNotes

The essentials of web crawling for Python beginners

7,016 (+50)

DropsDevopsOrg/ECommerceCrawlers

4,811 (+43)

mit

unnohwn/telegram-scraper

99 (+33)

mit

kkoooqq/fakebrowser

🤖 Fake fingerprints to bypass anti-bot systems. Simulate mouse and keyboard operations to make behavior like a real person.

1,198 (+30)

lgpl-3.0

Faker-lz/Topic_and_user_profile_analysis_system

Weibo-based system for online public-opinion topic analysis and user profiling

389 (+27)

mikumifa/BiliShareMall

Graphical crawler and search script for the Magic Reward (魔力赏) marketplace in Bilibili's member shop

50 (+23)

LuckyZXL2016/Movie_Recommend

Spark-based movie recommendation system, including a crawler project, a website, an admin back end, and the Spark recommendation engine

2,846 (+22)

mit

nghuyong/WeiboSpider

Actively maintained Sina Weibo collection tool 🚀🚀🚀

3,711 (+22)

mit

scrapy-plugins/scrapy-playwright

🎭 Playwright integration for Scrapy

1,053 (+21)

bsd-3-clause

eracle/linkedin

LinkedIn Scraper using Selenium Web Driver, Chromium headless, Docker and Scrapy

803 (+20)

QianyanTech/Image-Downloader

Download images from Google, Bing, Baidu.

2,239 (+20)

mit

chyroc/WechatSogou

WeChat official account crawler interface based on Sogou WeChat search

5,946 (+20)

apache-2.0

wkunzhi/Python3-Spider

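Hands-on Python crawlers - simulated login to major sites including but not limited to: slider captcha, Pinduoduo, Meituan, Baidu, bilibili, Dianping, Taobao. If you like it, please star ❤️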

3,074 (+19)

my8100/scrapydweb

Web app for Scrapyd cluster management, Scrapy log analysis & visualization, Auto packaging, Timer tasks, Monitor & Alert, and Mobile UI. DEMO 👉

3,181 (+16)

gpl-3.0

TikHub/TikHub-API-Python-SDK

365 (+15)

apache-2.0

rmax/scrapy-redis

Redis-based components for Scrapy.

5,551 (+14)

mit

alanchn31/Data-Engineering-Projects

Personal Data Engineering Projects

876 (+13)

librauee/Reptile

1,615 (+13)

Last month (relative gain)

mikumifa/BiliShareMall

Graphical crawler and search script for the Magic Reward (魔力赏) marketplace in Bilibili's member shop

50 (+85%)

unnohwn/telegram-scraper

99 (+50%)

mit

sucv/paperCrawler

This is a Scrapy-based web-spider. It scrapes papers from TOP conferences and journals.

36 (+9%)

wuzhy1ng/BlockchainSpider

A toolkit for blockchain data collection

128 (+8%)

Faker-lz/Topic_and_user_profile_analysis_system

Weibo-based system for online public-opinion topic analysis and user profiling

389 (+7%)

joaopauloaramuni/python

Repo Python

45 (+7%)

mit

Lan-ce-lot/pythorch-text-classification

123 (+5%)

apache-2.0

Lan-ce-lot/weibo-opinion-analysis

Public-opinion analysis project for Weibo text data, including a Weibo crawler, LDA topic analysis, and sentiment analysis

62 (+5%)

apache-2.0

corralm/yc-scraper

✌️Y Combinator directory scraper

43 (+5%)

mit

TikHub/TikHub-API-Python-SDK

365 (+4%)

apache-2.0

schesa/ImgFlip575K_Dataset

🤡 575K memes from ImgFlip

51 (+4%)

doforce/github-trending

Real-time APIs for GitHub trending repositories and developers, powered by crawlers

56 (+4%)

mit

klane/databall

Betting on the NBA with data

141 (+3%)

mit

hwaves/m3u8_To_MP4

Python downloader for saving m3u8 videos to local MP4 files.

73 (+3%)

mit

jxlil/scrapy-impersonate

Scrapy download handler that can impersonate browsers' TLS signatures or JA3 fingerprints.

114 (+3%)

mit

kkoooqq/fakebrowser

🤖 Fake fingerprints to bypass anti-bot systems. Simulate mouse and keyboard operations to make behavior like a real person.

1,198 (+3%)

lgpl-3.0

eracle/linkedin

LinkedIn Scraper using Selenium Web Driver, Chromium headless, Docker and Scrapy

803 (+3%)

ahmedshahriar/bd-medicine-scraper

Scrapy-Django PostgreSQL integrated API with Proxy IP configuration that scrapes all medicine data (meds, prices, generics, companies, indications) from Bangladesh (30k+ pages)

47 (+2%)

apache-2.0

gsxhnd/garage

Set of crawl service for javbus

49 (+2%)

mit

scrapy-plugins/scrapy-playwright

🎭 Playwright integration for Scrapy

1,053 (+2%)

bsd-3-clause

Last 12-months (new repositories)

unnohwn/telegram-scraper

mit

mikumifa/BiliShareMall

Graphical crawler and search script for the Magic Reward (魔力赏) marketplace in Bilibili's member shop

joaopauloaramuni/python

Repo Python

mit

hehehai/x-hiring-grab

🤗 Latest daily job postings: data scraping and AI analysis program

Last 12-months (absolute gain)

crawlab-team/crawlab

Distributed web crawler admin platform for spiders management regardless of languages and frameworks.

11,415 (+934)

bsd-3-clause

Boris-code/feapder

3,034 (+731)

DropsDevopsOrg/ECommerceCrawlers

4,811 (+601)

mit

lining0806/PythonSpiderNotes

The essentials of web crawling for Python beginners

7,016 (+564)

ityard/python-fxxk-spider

A collection of free Python web crawler projects

1,084 (+512)

apache-2.0

nghuyong/WeiboSpider

Actively maintained Sina Weibo collection tool 🚀🚀🚀

3,711 (+437)

mit

wkunzhi/Python3-Spider

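Hands-on Python crawlers - simulated login to major sites including but not limited to: slider captcha, Pinduoduo, Meituan, Baidu, bilibili, Dianping, Taobao. If you like it, please star ❤️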

3,074 (+418)

scrapy-plugins/scrapy-playwright

🎭 Playwright integration for Scrapy

1,053 (+343)

bsd-3-clause

Faker-lz/Topic_and_user_profile_analysis_system

Weibo-based system for online public-opinion topic analysis and user profiling

389 (+288)

my8100/scrapydweb

Web app for Scrapyd cluster management, Scrapy log analysis & visualization, Auto packaging, Timer tasks, Monitor & Alert, and Mobile UI. DEMO 👉

3,181 (+281)

gpl-3.0

chyroc/WechatSogou

WeChat official account crawler interface based on Sogou WeChat search

5,946 (+269)

apache-2.0

alanchn31/Data-Engineering-Projects

Personal Data Engineering Projects

876 (+267)

kkoooqq/fakebrowser

🤖 Fake fingerprints to bypass anti-bot systems. Simulate mouse and keyboard operations to make behavior like a real person.

1,198 (+267)

lgpl-3.0

LuckyZXL2016/Movie_Recommend

Spark-based movie recommendation system, including a crawler project, a website, an admin back end, and the Spark recommendation engine

2,846 (+254)

mit

QianyanTech/Image-Downloader

Download images from Google, Bing, Baidu.

2,239 (+234)

mit

Gerapy/Gerapy

Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Django and Vue.js

3,365 (+233)

mit

TikHub/TikHub-API-Python-SDK

365 (+227)

apache-2.0

eracle/linkedin

LinkedIn Scraper using Selenium Web Driver, Chromium headless, Docker and Scrapy

803 (+215)

moyada/stealer

Video watermark remover for Douyin, Kuaishou, Huoshan, and Pipixia

1,002 (+201)

mit

eliasdabbas/advertools

advertools - online marketing productivity and analysis tools

1,164 (+200)

mit

Last 12-months (relative gain)

jxlil/scrapy-impersonate

Scrapy download handler that can impersonate browsers' TLS signatures or JA3 fingerprints.

114 (+660%)

mit

Faker-lz/Topic_and_user_profile_analysis_system

Weibo-based system for online public-opinion topic analysis and user profiling

389 (+285%)

hehehai/x-hiring-grab

🤗 Latest daily job postings: data scraping and AI analysis program

31 (+244%)

corralm/yc-scraper

✌️Y Combinator directory scraper

43 (+231%)

mit

lizongying/go-crawler

A web crawling framework implemented in Golang, it is simple to write and delivers powerful performance. It comes with a wide range of practical middleware and supports various parsing and storage met...

114 (+200%)

TikHub/TikHub-API-Python-SDK

365 (+164%)

apache-2.0

tech-engine/goscrapy

GoScrapy: Harnessing Go's power for blazingly fast web scraping, inspired by Python's Scrapy framework.

89 (+154%)

Lan-ce-lot/weibo-opinion-analysis

Public-opinion analysis project for Weibo text data, including a Weibo crawler, LDA topic analysis, and sentiment analysis

62 (+138%)

apache-2.0

sfvsfv/Crawer

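Companion code for the book 《Python网络爬虫入门到实战》 (Python Web Crawlers: From Beginner to Practice). A collection of crawler projects.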

31 (+121%)

mit

shengchenyang/AyugeSpiderTools

Lets Scrapy developers skip writing modules such as item, pipeline, and middleware for common scenarios, freeing up developers' hands.

79 (+108%)

mit

ityard/python-fxxk-spider

A collection of free Python web crawler projects

1,084 (+90%)

apache-2.0

Lan-ce-lot/pythorch-text-classification

123 (+78%)

apache-2.0

ahmedshahriar/bd-medicine-scraper

Scrapy-Django PostgreSQL integrated API with Proxy IP configuration that scrapes all medicine data (meds, prices, generics, companies, indications) from Bangladesh (30k+ pages)

47 (+74%)

apache-2.0

wuzhy1ng/BlockchainSpider

A toolkit for blockchain data collection

128 (+56%)

ndgigliotti/shopify-spy

Extract structured data from Shopify websites.

85 (+52%)

sucv/paperCrawler

This is a Scrapy-based web-spider. It scrapes papers from TOP conferences and journals.

36 (+50%)

gsxhnd/garage

Set of crawl service for javbus

49 (+48%)

mit

scrapy-plugins/scrapy-playwright

🎭 Playwright integration for Scrapy

1,053 (+48%)

bsd-3-clause

hwaves/m3u8_To_MP4

Python downloader for saving m3u8 videos to local MP4 files.

73 (+46%)

mit

alanchn31/Data-Engineering-Projects

Personal Data Engineering Projects

876 (+44%)