Statistics for topic spark
RepositoryStats tracks 631,873 GitHub repositories, of which 552 are tagged with the spark topic. The most common primary language for repositories using this topic is Scala (142). Other languages include Java (104), Python (102), Jupyter Notebook (63), JavaScript (18), Shell (12), and Go (11).
Stargazers over time for topic spark
Most starred repositories for topic spark
Trending repositories for topic spark
Data Engineering Zoomcamp is a free nine-week course that covers the fundamentals of data engineering.
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Apache Doris is an easy-to-use, high performance and unified analytics database.
Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
Apache Spark - A unified analytics engine for large-scale data processing
🎹 Moodify - an emotion-based music recommendation system that uses AI/ML models to analyze text, speech, and facial expressions, providing personalized music recommendations across web and mobile pla...
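💭 A customizable single-turn Chat Bot web MVP prototype template, built with mainstream technologies including Vue 3, Vite 6, TypeScript, Naive UI, Pinia(v3), and UnoCSS. 🧤 Simple integration with large-model APIs; uses a single-turn AI Q&A mode in which each question is answered independently with no context; supports streaming output with a typewriter effect; integrates markdown-it for formula and syntax-highlighted preview, Deepsee...
Big data computing platform based on Spark <至轻云: ultra-lightweight big data computing platform / data middle platform>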
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Data Engineering Zoomcamp is a free nine-week course that covers the fundamentals of data engineering.
Apache Spark - A unified analytics engine for large-scale data processing
Open source project for data preparation of LLM application builders
Apache Doris is an easy-to-use, high performance and unified analytics database.
🎹 Moodify - an emotion-based music recommendation system that uses AI/ML models to analyze text, speech, and facial expressions, providing personalized music recommendations across web and mobile pla...
Formally verified, real-time capable, UNIX-like operating system kernel written in SPARK and Ada.
Open source project for data preparation of LLM application builders
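💭 A customizable single-turn Chat Bot web MVP prototype template, built with mainstream technologies including Vue 3, Vite 6, TypeScript, Naive UI, Pinia(v3), and UnoCSS. 🧤 Simple integration with large-model APIs; uses a single-turn AI Q&A mode in which each question is answered independently with no context; supports streaming output with a typewriter effect; integrates markdown-it for formula and syntax-highlighted preview, Deepsee...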
Data Engineering Zoomcamp is a free nine-week course that covers the fundamentals of data engineering.
Apache Spark - A unified analytics engine for large-scale data processing
Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
Apache Doris is an easy-to-use, high performance and unified analytics database.
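💭 A customizable single-turn Chat Bot web MVP prototype template, built with mainstream technologies including Vue 3, Vite 6, TypeScript, Naive UI, Pinia(v3), and UnoCSS. 🧤 Simple integration with large-model APIs; uses a single-turn AI Q&A mode in which each question is answered independently with no context; supports streaming output with a typewriter effect; integrates markdown-it for formula and syntax-highlighted preview, Deepsee...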
This is a repository to demonstrate my details, skills, projects and to keep track of my progression in Data Analytics and Data Science topics.
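An easy-to-use, ultra-high-performance big data governance engine built on Spark, Spark MLlib, and Debezium. It suits unified batch and stream data integration and data analysis, and supports machine learning models, real-time CDC data capture, data modeling, algorithm modeling, and OLAP data analysis.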
🎹 Moodify - an emotion-based music recommendation system that uses AI/ML models to analyze text, speech, and facial expressions, providing personalized music recommendations across web and mobile pla...
Open source project for data preparation of LLM application builders
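💭 A customizable single-turn Chat Bot web MVP prototype template, built with mainstream technologies including Vue 3, Vite 6, TypeScript, Naive UI, Pinia(v3), and UnoCSS. 🧤 Simple integration with large-model APIs; uses a single-turn AI Q&A mode in which each question is answered independently with no context; supports streaming output with a typewriter effect; integrates markdown-it for formula and syntax-highlighted preview, Deepsee...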
Open Source LeetCode for PySpark, Spark, Pandas and DBT/Snowflake
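🚀 Reverse-engineered API for the iFlytek Spark (讯飞星火) large model [specialty: office assistant]. Supports high-speed streaming output, agent conversations, web search, AI image generation, long-document interpretation, image analysis, and multi-turn conversations, with zero-configuration deployment, multi-token support, and automatic cleanup of session traces. For testing only; for commercial use, please go to the official open platform.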
Data Engineering Zoomcamp is a free nine-week course that covers the fundamentals of data engineering.
Apache Spark - A unified analytics engine for large-scale data processing
Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
Apache Doris is an easy-to-use, high performance and unified analytics database.
Big data computing platform based on Spark <至轻云: ultra-lightweight big data computing platform / data middle platform>
This is a repository to demonstrate my details, skills, projects and to keep track of my progression in Data Analytics and Data Science topics.
Large Tech Knowledge Base from 20 years in DevOps, Linux, Cloud, Big Data, AWS, GCP etc - gradually porting my large private knowledge base to public
End-to-end data engineering project with Kafka, Airflow, Spark, Postgres and Docker.