Trending repositories for topic datasets
CSGHub is an opensource large model assets platform just like on-premise huggingface which helps to manage datasets, model files, codes and more. CSGHub是一个开源、可信的大模型资产管理平台,可帮助用户治理LLM和LLM应用生命周期中涉及到的资产(数...
Label Studio is a multi-type data labeling and annotation tool with standardized output format
A curated list of language modeling researches for code and related datasets.
AKShare is an elegant and simple financial data interface library for Python, built for human beings! 开源财经数据接口库
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
A repository that contains models, datasets, and fine-tuning techniques for DB-GPT, with the purpose of enhancing model performance in Text-to-SQL
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop....
FL Chart is a highly customizable Flutter chart library that supports Line Chart, Bar Chart, Pie Chart, Scatter Chart, and Radar Chart.
Techniques for deep learning with satellite & aerial imagery
A list of awesome papers and resources of recommender system on large language model (LLM).
TorchGeo: datasets, samplers, transforms, and pre-trained models for geospatial data
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
Langtrace 🔍 is an open-source, Open Telemetry based end-to-end observability tool for LLM applications, providing real-time tracing, evaluations and metrics for popular LLMs, LLM frameworks, vectorD...
CVPR 2023-2024 Papers: Dive into advanced research presented at the leading computer vision conference. Keep up to date with the latest developments in computer vision and deep learning. Code included...
CSGHub is an opensource large model assets platform just like on-premise huggingface which helps to manage datasets, model files, codes and more. CSGHub是一个开源、可信的大模型资产管理平台,可帮助用户治理LLM和LLM应用生命周期中涉及到的资产(数...
Langtrace 🔍 is an open-source, Open Telemetry based end-to-end observability tool for LLM applications, providing real-time tracing, evaluations and metrics for popular LLMs, LLM frameworks, vectorD...
WildlifeDatasets: An open-source toolkit for animal re-identification
Dataset and utilities for working with protein-protein interactions in 3D
A curated list of language modeling researches for code and related datasets.
CVPR 2023-2024 Papers: Dive into advanced research presented at the leading computer vision conference. Keep up to date with the latest developments in computer vision and deep learning. Code included...
AI Audio Datasets 🎵. A list of datasets consisting of speech, music, and sound effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development,...
White balance camera-rendered sRGB images (CVPR 2019) [Matlab & Python]
A curated list of datasets, publically available for machine learning research in the area of manufacturing
Awesome Chinese LLM: A curated list of Chinese Large Language Model 中文大语言模型数据集和模型资料汇总
A curated list of Human Preference Datasets for LLM fine-tuning, RLHF, and eval.
Croissant is a high-level format for machine learning datasets that brings together four rich layers.
A repository that contains models, datasets, and fine-tuning techniques for DB-GPT, with the purpose of enhancing model performance in Text-to-SQL
Generative modeling of synthetic time series data and time series augmentations
CSGHub is an opensource large model assets platform just like on-premise huggingface which helps to manage datasets, model files, codes and more. CSGHub是一个开源、可信的大模型资产管理平台,可帮助用户治理LLM和LLM应用生命周期中涉及到的资产(数...
Label Studio is a multi-type data labeling and annotation tool with standardized output format
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
AKShare is an elegant and simple financial data interface library for Python, built for human beings! 开源财经数据接口库
A curated list of language modeling researches for code and related datasets.
Techniques for deep learning with satellite & aerial imagery
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop....
FL Chart is a highly customizable Flutter chart library that supports Line Chart, Bar Chart, Pie Chart, Scatter Chart, and Radar Chart.
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
A repository that contains models, datasets, and fine-tuning techniques for DB-GPT, with the purpose of enhancing model performance in Text-to-SQL
Croissant is a high-level format for machine learning datasets that brings together four rich layers.
CVPR 2023-2024 Papers: Dive into advanced research presented at the leading computer vision conference. Keep up to date with the latest developments in computer vision and deep learning. Code included...
Langtrace 🔍 is an open-source, Open Telemetry based end-to-end observability tool for LLM applications, providing real-time tracing, evaluations and metrics for popular LLMs, LLM frameworks, vectorD...
A list of awesome papers and resources of recommender system on large language model (LLM).
CSGHub is an opensource large model assets platform just like on-premise huggingface which helps to manage datasets, model files, codes and more. CSGHub是一个开源、可信的大模型资产管理平台,可帮助用户治理LLM和LLM应用生命周期中涉及到的资产(数...
🤖 Build AI applications with confidence ✅ DSPy Visualizer ✅ Understand how your users are using your LLM-app ✅ Get a full picture of the quality performance of your LLM-app ✅ Collaborate with your st...
Translate large dataset to any language with google translation api and multithread processing, no key required !
Langtrace 🔍 is an open-source, Open Telemetry based end-to-end observability tool for LLM applications, providing real-time tracing, evaluations and metrics for popular LLMs, LLM frameworks, vectorD...
A collection of datasets for language model pretraining including scripts for downloading, preprocesssing, and sampling.
WildlifeDatasets: An open-source toolkit for animal re-identification
Croissant is a high-level format for machine learning datasets that brings together four rich layers.
CVPR 2023-2024 Papers: Dive into advanced research presented at the leading computer vision conference. Keep up to date with the latest developments in computer vision and deep learning. Code included...
A list of datasets, tools, papers and code related to Deepfakes.
Dataset and utilities for working with protein-protein interactions in 3D
CSGHub Server is the backend server for CSGHub which helps user to manage datasets, model files, codes and more. CSGHub Server是开源大模型资产管理平台CSGHub的服务端部分的开源项目,提供基于REST API的模型和数据集等大模型资产管理功能。欢迎关注反馈和Star⭐️
"Enhancing LLM Factual Accuracy with RAG to Counter Hallucinations: A Case Study on Domain-Specific Queries in Private Knowledge-Bases" by Jiarui Li and Ye Yuan and Zehua Zhang
A curated list of language modeling researches for code and related datasets.
A curated list of datasets, publically available for machine learning research in the area of manufacturing
Label Studio is a multi-type data labeling and annotation tool with standardized output format
CSGHub is an opensource large model assets platform just like on-premise huggingface which helps to manage datasets, model files, codes and more. CSGHub是一个开源、可信的大模型资产管理平台,可帮助用户治理LLM和LLM应用生命周期中涉及到的资产(数...
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
AKShare is an elegant and simple financial data interface library for Python, built for human beings! 开源财经数据接口库
Techniques for deep learning with satellite & aerial imagery
A curated list of language modeling researches for code and related datasets.
A list of awesome papers and resources of recommender system on large language model (LLM).
FL Chart is a highly customizable Flutter chart library that supports Line Chart, Bar Chart, Pie Chart, Scatter Chart, and Radar Chart.
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop....
A repository that contains models, datasets, and fine-tuning techniques for DB-GPT, with the purpose of enhancing model performance in Text-to-SQL
Langtrace 🔍 is an open-source, Open Telemetry based end-to-end observability tool for LLM applications, providing real-time tracing, evaluations and metrics for popular LLMs, LLM frameworks, vectorD...
A list of publicly available datasets with real-time data maintained by the team at bytewax.io
TorchGeo: datasets, samplers, transforms, and pre-trained models for geospatial data
Langtrace 🔍 is an open-source, Open Telemetry based end-to-end observability tool for LLM applications, providing real-time tracing, evaluations and metrics for popular LLMs, LLM frameworks, vectorD...
CSGHub is an opensource large model assets platform just like on-premise huggingface which helps to manage datasets, model files, codes and more. CSGHub是一个开源、可信的大模型资产管理平台,可帮助用户治理LLM和LLM应用生命周期中涉及到的资产(数...
🤖 Build AI applications with confidence ✅ DSPy Visualizer ✅ Understand how your users are using your LLM-app ✅ Get a full picture of the quality performance of your LLM-app ✅ Collaborate with your st...
Dataset and utilities for working with protein-protein interactions in 3D
WildlifeDatasets: An open-source toolkit for animal re-identification
A collection of datasets for language model pretraining including scripts for downloading, preprocesssing, and sampling.
📊 Adana - 1-click analytical dashboard for OSINT researchers
A list of publicly available datasets with real-time data maintained by the team at bytewax.io
Translate large dataset to any language with google translation api and multithread processing, no key required !
A list of datasets, tools, papers and code related to Deepfakes.
CSGHub Server is the backend server for CSGHub which helps user to manage datasets, model files, codes and more. CSGHub Server是开源大模型资产管理平台CSGHub的服务端部分的开源项目,提供基于REST API的模型和数据集等大模型资产管理功能。欢迎关注反馈和Star⭐️
CVPR 2023-2024 Papers: Dive into advanced research presented at the leading computer vision conference. Keep up to date with the latest developments in computer vision and deep learning. Code included...
Datasets, case studies and benchmarks for extracting structured information from PDFs, HTML files or images, created by the Parsee.ai team. Datasets also on Hugging Face: https://huggingface.co/parsee...
Awesome Chinese LLM: A curated list of Chinese Large Language Model 中文大语言模型数据集和模型资料汇总
A repository that contains models, datasets, and fine-tuning techniques for DB-GPT, with the purpose of enhancing model performance in Text-to-SQL
ICCV 2023 Papers: Discover cutting-edge research from ICCV 2023, the leading computer vision conference. Stay updated on the latest in computer vision and deep learning, with code included. ⭐ support ...
A curated list of language modeling researches for code and related datasets.
CSGHub is an opensource large model assets platform just like on-premise huggingface which helps to manage datasets, model files, codes and more. CSGHub是一个开源、可信的大模型资产管理平台,可帮助用户治理LLM和LLM应用生命周期中涉及到的资产(数...
CVPR 2023-2024 Papers: Dive into advanced research presented at the leading computer vision conference. Keep up to date with the latest developments in computer vision and deep learning. Code included...
This is the repository for our paper "INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning"
Langtrace 🔍 is an open-source, Open Telemetry based end-to-end observability tool for LLM applications, providing real-time tracing, evaluations and metrics for popular LLMs, LLM frameworks, vectorD...
Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.
This reposiotry is the collection for public 3D LiDAR datasets
A curated list of papers that released datasets along with their work
Data Engineering Pilipinas is a community for data engineers, data analysts, data scientists, developers, AI / ML engineers, and users of closed and open source data tools and methods / techniques in ...
BeaverTails is a collection of datasets designed to facilitate research on safety alignment in large language models (LLMs).
Official repository for the ICCV2023 paper "Kick Back & Relax: Learning to Reconstruct the World by Watching SlowTV"
Label Studio is a multi-type data labeling and annotation tool with standardized output format
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
AKShare is an elegant and simple financial data interface library for Python, built for human beings! 开源财经数据接口库
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop....
Techniques for deep learning with satellite & aerial imagery
An open source multi-tool for exploring and publishing data
A repository that contains models, datasets, and fine-tuning techniques for DB-GPT, with the purpose of enhancing model performance in Text-to-SQL
ICCV 2023 Papers: Discover cutting-edge research from ICCV 2023, the leading computer vision conference. Stay updated on the latest in computer vision and deep learning, with code included. ⭐ support ...
FL Chart is a highly customizable Flutter chart library that supports Line Chart, Bar Chart, Pie Chart, Scatter Chart, and Radar Chart.
A curated list of language modeling researches for code and related datasets.
A list of awesome papers and resources of recommender system on large language model (LLM).
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
TorchGeo: datasets, samplers, transforms, and pre-trained models for geospatial data
CSGHub is an opensource large model assets platform just like on-premise huggingface which helps to manage datasets, model files, codes and more. CSGHub是一个开源、可信的大模型资产管理平台,可帮助用户治理LLM和LLM应用生命周期中涉及到的资产(数...
ICCV 2023 Papers: Discover cutting-edge research from ICCV 2023, the leading computer vision conference. Stay updated on the latest in computer vision and deep learning, with code included. ⭐ support ...
Croissant is a high-level format for machine learning datasets that brings together four rich layers.
Tools for easing the handoff between AI/ML and App/SRE teams.
[ICLR 2024] Mol-Instructions: A Large-Scale Biomolecular Instruction Dataset for Large Language Models
A curated list of language modeling researches for code and related datasets.
Langtrace 🔍 is an open-source, Open Telemetry based end-to-end observability tool for LLM applications, providing real-time tracing, evaluations and metrics for popular LLMs, LLM frameworks, vectorD...
A curated list of papers that released datasets along with their work
Awesome Chinese LLM: A curated list of Chinese Large Language Model 中文大语言模型数据集和模型资料汇总
Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.
A list of awesome papers and resources of recommender system on large language model (LLM).
A collection of awesome-prompt-datasets, awesome-instruction-dataset, to train ChatLLM such as chatgpt 收录各种各样的指令数据集, 用于训练 ChatLLM 模型。
Repository for organizing datasets and papers used in Open LLM.
A repository for surgical action triplet dataset. Data are videos of laparoscopic cholecystectomy that have been annotated with <instrument, verb, target> labels for every surgical fine-grained activi...
🚀🚀🚀A collection of some awesome public projects about Large Language Model, Vision Foundation Model and AI Generated Content.
This reposiotry is the collection for public 3D LiDAR datasets