Trending repositories for topic reinforcement-learning
🧑‍🏫 60 Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gan...
Code for "Hierarchical World Models as Visual Whole-Body Humanoid Controllers"
FinGPT: Open-Source Financial Large Language Models! Revolutionize 🔥 We release the trained model on HuggingFace.
An easy-to-use, scalable, and high-performance RLHF framework (supports 70B+ full tuning, LoRA, Mixtral, and KTO)
Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.
Hung-yi Lee Deep Learning Tutorial (recommended by Prof. Hung-yi Lee 👍). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases
Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
🔥 A tool for visualizing and tracking your machine learning experiments. This repo contains the CLI and Python API.
High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
A curated list of Artificial Intelligence (AI) courses, books, video lectures and papers.
Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
A list of awesome and popular robot learning environments
Applying multi-agent reinforcement learning to highway-merging autonomous vehicles
DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model.
Evaluating and reproducing real-world robot manipulation policies (e.g., RT-1, RT-1-X, Octo) in simulation under common setups (e.g., Google Robot, WidowX+Bridge)
[ICLR 2024] The official implementation of "Safe Offline Reinforcement Learning with Feasibility-Guided Diffusion Model"
Auxiliary code for pulling and loading DI-engine-based reinforcement learning models from the Hugging Face Hub, or pushing them to the Hub with an auto-created model card.
This repo implements our paper, "Learning to Search Feasible and Infeasible Regions of Routing Problems with Flexible Neural k-Opt", which has been accepted at NeurIPS 2023.
ICLR 2024: SafeDreamer: Safe Reinforcement Learning with World Models
AAAI 2024 Papers: Explore a comprehensive collection of innovative research papers presented at one of the premier artificial intelligence conferences. Seamlessly integrate code implementations for be...
[NeurIPS 2023] DeepACO: Neural-enhanced Ant Systems for Combinatorial Optimization
PyTorch implementation of the implicit Q-learning algorithm (IQL)
PsyDI: An MBTI agent that helps you understand your personality type through a relaxed multi-modal interaction.
A PyTorch implementation of DeepMind's AlphaZero agent to play Go and Gomoku board games
[ICLR 2024] Code for the paper "Text2Reward: Automated Dense Reward Function Generation for Reinforcement Learning"
This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."
Bullet Physics SDK: real-time collision detection and multi-physics simulation for VR, games, visual effects, robotics, machine learning etc.
Code for the ICLR 2024 spotlight paper: "Learning to Act without Actions" (introducing Latent Action Policies)
[TMI'22] "AADG: Automatic Augmentation for Domain Generalization on Retinal Image Segmentation".
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
A curated list of reinforcement learning with human feedback resources (continually updated)
The TTCP CAGE Challenges are a series of public challenges instigated to foster the development of autonomous cyber defensive agents. This CAGE Challenge 4 (CC4) returns to a defence industry enterpri...
🏛️A research-friendly codebase for fast experimentation of single-agent reinforcement learning in JAX • End-to-End JAX RL
A collection of recent papers on building autonomous agents. Two topics are included: RL-based and LLM-based agents.
Paper collection on building and evaluating language model agents via executable language grounding
An extensive library of AI resources including books, courses, papers, guides, articles, tutorials, notebooks, AI field advancements and more.
AI molecular design tool for de novo design, scaffold hopping, R-group replacement, linker design and molecule optimization.
Code for "TD-MPC2: Scalable, Robust World Models for Continuous Control"
AlignProp uses direct reward backpropagation for the alignment of large-scale text-to-image diffusion models. Our method is 25x more sample and compute efficient than reinforcement learning methods (P...
A Comprehensive Survey of Forgetting in Deep Learning Beyond Continual Learning. arXiv:2307.09218.
JAX-accelerated Meta-Reinforcement Learning Environments Inspired by XLand and MiniGrid 🏎️
A Fast, Portable Deep Reinforcement Learning Library for Continuous Control
Chinese-language reinforcement learning tutorial (the Mushroom Book 🍄). Read online: https://datawhalechina.github.io/easy-rl/
The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement lea...
A modern custom scheduler for Anki based on the Free Spaced Repetition Scheduler (FSRS) algorithm
Summary of key papers and blogs about diffusion models to learn about the topic. Detailed list of all published diffusion robotics papers.
A deep reinforcement learning (DRL) based approach for spatial layout of land use and roads in urban communities. (Nature Computational Science)
General Optimal control Problem Solver (GOPS), an easy-to-use PyTorch reinforcement learning solver package for industrial control.
Code for the paper "ReMax: A Simple, Efficient and Effective Reinforcement Learning Method for Aligning Large Language Models"
A Pythonic microframework for multi-armed bandit problems
A PyTorch library for all things Reinforcement Learning (RL) for Combinatorial Optimization (CO)
Implementation of Robust Imitation Learning against Variations in Environment Dynamics
🔋 Datasets with baselines for offline multi-agent reinforcement learning.
Code to reproduce the experiments in Sample Efficient Reinforcement Learning via Model-Ensemble Exploration and Exploitation (MEEE).
Tactics2D: A Reinforcement Learning Environment Library with Generative Scenarios for Driving Decision-making
Generating sets of formulaic alpha (predictive) stock factors via reinforcement learning.
This repository collects code that encapsulates commonly used machine learning algorithms. Most of it is based on NumPy, pandas, or Torch. You can deepen your understanding to ...
[CVPR 2024] Official code for EgoGen: An Egocentric Synthetic Data Generator