Statistics for topic data
RepositoryStats tracks 560,267 GitHub repositories, of which 930 are tagged with the data topic. The most common primary language for repositories using this topic is Python (257). Other languages include TypeScript (89), JavaScript (80), Jupyter Notebook (73), R (37), Go (36), Java (34), HTML (28), Rust (26), and C# (18).
Stargazers over time for topic data
Most starred repositories for topic data
Trending repositories for topic data
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
🤖 Powerful asynchronous state management, server-state utilities and data fetching for the web. TS/JS, React Query, Solid Query, Svelte Query and Vue Query.
Infinitely scalable, event-driven, language-agnostic orchestration and scheduling platform to manage millions of workflows declaratively in code.
LakeSail's computation framework with a mission to unify stream processing, batch processing, and compute-intensive (AI) workloads.
Chat with your data, modify it, visualize it, create and test machine learning models all in plain English. DataHorse makes data analysis and data science conversational using LLMs.
Analytics for developers. Setup Analytics in 30 seconds with just one line of code. Display all your data on an AI-powered dashboard. Fully self-hostable and GDPR compliant.
Open source project for data preparation of LLM application builders
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
Enterprise-grade toolkit for teams to continuously optimize compound AI systems, from pre to post-production
Fast, streaming indexing and query library for AI (RAG) applications, written in Rust
The simplest, fastest way to get business intelligence and analytics to everyone in your company :yum:
Build super simple end-to-end data & ETL pipelines for your vector databases and Generative AI applications
Command line interface for DuckDB, LibSQL, MariaDB, MySQL, PostgreSQL, Snowflake, SQLite3 and SQL Server
This is a repo with links to everything you'd ever want to learn about data engineering
LLM based data scientist, AI native data application. AI-driven infinite thinking redefines BI.
A configuration as code language with rich validation and tooling.
Superduper: Integrate AI models and machine learning workflows with your database to implement custom AI applications, without moving your data. Including streaming inference, scalable model hosting, ...
The dbt data-validation toolkit for teams that care about building better data
🛠️ Tools for working with data effectively - data contracts using types, schemas, domain validation rules, type-safe casting, and more.