Statistics for topic data
RepositoryStats tracks 579,129 GitHub repositories; of these, 967 are tagged with the data topic. The most common primary language for repositories using this topic is Python (270). Other languages include TypeScript (90), JavaScript (87), Jupyter Notebook (74), Go (38), R (38), Java (33), HTML (30), Rust (28), and C++ (19).
Stargazers over time for topic data
Most starred repositories for topic data
Trending repositories for topic data
This is a repo with links to everything you'd ever want to learn about data engineering
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
Chat with your database (SQL, CSV, pandas, polars, MongoDB, NoSQL, etc). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.
Ylem is an open-source platform for real-time data streaming orchestration
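Scrapyman data API service. Provides: Taobao, Xiaohongshu, JD.com, Douyin (e-commerce), Douyin (video), Kuaishou, Pugongying, Xingtu, Pinduoduo, WeChat Official Accounts, Dianping, Bilibili, Zhihu, Weibo, Beike, Bigo, Temu, Lazada, Shopee, SHEIN, Baidu Index, Ctrip, Boss Zhipin, Zhaopin, Lagou, Toutiao, Facebook, YouTube, Instagram, Twitter. Crawler, data collection, scrapy, interface, API.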
pgCompare – a straightforward utility crafted to simplify the data comparison process, providing a robust solution for comparing data across various database platforms.
2025 AI/ML internship & new graduate job list updated daily
LLM based data scientist, AI native data application. AI-driven infinite thinking redefines BI.
A configuration as code language with rich validation and tooling.
🤖 Powerful asynchronous state management, server-state utilities and data fetching for the web. TS/JS, React Query, Solid Query, Svelte Query and Vue Query.
Neum AI is a best-in-class framework to manage the creation and synchronization of vector embeddings at large scale.