Trending repositories for topic datasets
Label Studio is a multi-type data labeling and annotation tool with standardized output format
AKShare is an elegant and simple open-source financial data interface library for Python, built for human beings!
csghub-server is the backend server for CSGHub, which helps users manage datasets and models, and also run Model Inference, Finetune and Application Spaces.
A list of awesome papers and resources on recommender systems based on large language models (LLMs).
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
[TMLR] A curated list of language modeling research for code and related datasets.
CSGHub is an open-source large model platform, similar to an on-premise version of Hugging Face. You can easily manage models and datasets, deploy model applications and set up model finetune or inference j...
Langtrace 🔍 is an open-source, OpenTelemetry-based end-to-end observability tool for LLM applications, providing real-time tracing, evaluations and metrics for popular LLMs, LLM frameworks, vectorD...
TorchGeo: datasets, samplers, transforms, and pre-trained models for geospatial data
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
A collection of prompt and instruction datasets (awesome-prompt-datasets, awesome-instruction-dataset) for training ChatLLM models such as ChatGPT.
Croissant is a high-level format for machine learning datasets that brings together four rich layers.
Securely share and store AI/ML projects as OCI artifacts in your container registry.
Useful links to content related to AI, Computer Vision, and Robotics.
A list of publicly available datasets with real-time data maintained by the team at bytewax.io
Industrial datasets - datasets for evaluating industrial intrusion detection systems on IPAL.
Datasets for remote sensing images (Paper: Exploring Models and Data for Remote Sensing Image Caption Generation)
Chinese NLP corpus of Chinese science fiction: an archive of the Ark Plan of the Ula Science Fiction Website (乌拉科幻小说网方舟计划存档), serving as a text corpus and text database of Chinese science fiction for natural language processing.
A curated list of peer-reviewed papers on theoretical and practical aspects of drivers' attention used for paper "Attention for Vision-Based Assistive and Automated Driving: A Review of Algorithms and...
OpenABC-D is a large-scale labeled dataset generated by synthesizing open source hardware IPs. This dataset can be used for various graph level prediction problems in chip design.
[ICLR 2024] Mol-Instructions: A Large-Scale Biomolecular Instruction Dataset for Large Language Models
A curated list of publicly available datasets for machine learning research in the area of manufacturing
You can find links to data acquisition websites.
[CVPR 2023] The official implementation of CVPR 2023 paper "Human-Art: A Versatile Human-Centric Dataset Bridging Natural and Artificial Scenes"
This project is a collection of recent research in areas such as new infrastructure and urban computing, including white papers, academic papers, AI labs, and datasets.
🌳 A curated list of ground-truth forest datasets for the machine learning and forestry community.
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
Techniques for deep learning with satellite & aerial imagery
A large collection of system log datasets for AI-driven log analytics [ISSRE'23]
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop....
FL Chart is a highly customizable Flutter chart library that supports Line Chart, Bar Chart, Pie Chart, Scatter Chart, and Radar Chart.
XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning
Multiple datasets for ARC (Abstraction and Reasoning Corpus)
A list of datasets, tools, papers and code related to Deepfakes.
Source-available LLM Ops platform and LLM Optimization Studio powered by DSPy.
Code and data for "ConflictBank: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLM" (NeurIPS 2024 Track Datasets and Benchmarks)
A repository for surgical action triplet dataset. Data are videos of laparoscopic cholecystectomy that have been annotated with <instrument, verb, target> labels for every surgical fine-grained activi...
👔IMAGDressing👔: Interactive Modular Apparel Generation for Virtual Dressing
This is the repository for our paper "INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning"
A curated list of Place Recognition methods, datasets, and various algorithms for LiDAR
WACV 2024 Papers: Discover cutting-edge research from WACV 2024, the leading computer vision conference. Stay updated on the latest in computer vision and deep learning, with code included. ⭐ support ...
Official implementation of "Holmes-VAD: Towards Unbiased and Explainable Video Anomaly Detection via Multi-modal LLM"
A repository of datasets paired with rich documentation, data essays, and teaching resources
[ECCV 2024] Official PyTorch Implementation of "How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs"
🎉🎨 Papers, Code, Datasets for Neuroscience and Cognition Science
An open source multi-tool for exploring and publishing data
A repository that contains models, datasets, and fine-tuning techniques for DB-GPT, with the purpose of enhancing model performance in Text-to-SQL
Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.
🦄 Unitxt: a Python library for getting data fired up and set for training and evaluation
Multilingual Large Language Models Evaluation Benchmark
"Enhancing LLM Factual Accuracy with RAG to Counter Hallucinations: A Case Study on Domain-Specific Queries in Private Knowledge-Bases" by Jiarui Li and Ye Yuan and Zehua Zhang
CVPR 2023-2024 Papers: Dive into advanced research presented at the leading computer vision conference. Keep up to date with the latest developments in computer vision and deep learning. Code included...
Resources about solar power systems for data science