Trending repositories for topic reinforcement-learning
🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga...
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention)
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.
Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.
"Hung-yi Lee Deep Learning Tutorial" (recommended by Prof. Hung-yi Lee 👍, the "Apple Book" 🍎). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases
DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model. NeurIPS 2024 Spotlight.
An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)
This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."
Clean, Robust, and Unified PyTorch implementation of popular Deep Reinforcement Learning (DRL) algorithms (Q-learning, Duel DDQN, PER, C51, Noisy DQN, PPO, DDPG, TD3, SAC, ASL)
High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
🕹️ A diverse suite of scalable reinforcement learning environments in JAX
A collection of industry classics and cutting-edge papers in recommendation, advertising, and search.
FinGPT: Open-Source Financial Large Language Models! Revolutionize 🔥 We release the trained model on HuggingFace.
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
ICLR 2024: SafeDreamer: Safe Reinforcement Learning with World Models
A PyTorch implementation of DeepMind's AlphaZero agent to play Go and Gomoku board games
adam implements a collection of algorithms for calculating rigid-body dynamics in Jax, CasADi, PyTorch, and Numpy.
Code release for Efficient Planning in a Compact Latent Action Space (ICLR2023) https://arxiv.org/abs/2208.10291.
Rank images using TrueSkill by comparing them against each other in the browser. 🖼📊
[ICLR 2024] The official implementation of "Safe Offline Reinforcement Learning with Feasibility-Guided Diffusion Model"
Evaluating and reproducing real-world robot manipulation policies (e.g., RT-1, RT-1-X, Octo) in simulation under common setups (e.g., Google Robot, WidowX+Bridge) (CoRL 2024)
Codes accompanying the paper "Score Regularized Policy Optimization through Diffusion Behavior" (ICLR 2024).
Implementation of Soft Actor Critic and some of its improvements in PyTorch
XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning
An OpenAI Gym-based framework for testing hybrid approaches (RL + path planning) for multi-UAV systems that are supposed to provide smart services.
[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
A curated list of reinforcement learning with human feedback resources (continually updated)
An all-weather, day-and-night, collision avoidance simulator that can be implemented as a digital twin for the autonomous COLREG-compliant navigation of maritime vessels.
Repository containing the code for the paper "Safe Model-Based Reinforcement Learning using Robust Control Barrier Functions". Specifically, an implementation of SAC + Robust Control Barrier Functions...
Project Code for the paper "Learning Visual Locomotion with Cross-Modal Supervision" (ICRA2023)
Play, learn, solve, and analyze No-Limit Texas Hold'em. The implementation follows Monte Carlo counterfactual regret minimization with hierarchical K-means imperfect-recall abstractions.
Proximal Policy Optimization (PPO) algorithm using PyTorch to train an agent for a rocket landing task in a custom environment
This repository contains an application using ROS2 Humble, Gazebo, OpenAI Gym and Stable Baselines3 to train reinforcement learning agents for a path planning problem.
A modified benchmark for designing and controlling 2D Voxel-based Soft Robots
A curated list of Artificial Intelligence (AI) courses, books, video lectures and papers.
HFTFramework, used for research on "A reinforcement learning approach to improve the performance of the Avellaneda-Stoikov market-making algorithm"
Official Implementation for the paper "R-AIF: Solving Sparse-Reward Robotic Tasks from Pixels with Active Inference and World Models"
IntersectionZoo: Eco-driving for Benchmarking Multi-Agent Contextual Reinforcement Learning
Train quadruped locomotion using reinforcement learning in Mujoco
Python library for solving reinforcement learning (RL) problems using generative models (e.g. Diffusion Models).
Multi-Agent Deep Reinforcement Learning (MA-DRL) Routing Simulator for satellite networks
Helpful DoggyBot: Open-World Object Fetching using Legged Robots and Vision-Language Models
[ICML 2024] Code for the paper "Confronting Reward Overoptimization for Diffusion Models: A Perspective of Inductive and Primacy Biases"
Official code for ICML 2024 paper, "RIME: Robust Preference-based Reinforcement Learning with Noisy Preferences" (ICML 2024 Spotlight)
Doosan robotic arm, simulation, control, visualization in Gazebo and ROS2 for Reinforcement Learning.
Using DDPG and ConvLSTM to control a drone to avoid obstacles in AirSim
AAAI 2024 Papers: Explore a comprehensive collection of innovative research papers presented at one of the premier artificial intelligence conferences. Seamlessly integrate code implementations for be...
🏛️A research-friendly codebase for fast experimentation of single-agent reinforcement learning in JAX • End-to-End JAX RL
Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models such as VideoCrafter, OpenSora, ModelScope and StableVideoDiffusion by finetuning them using various rewa...
UMI on Legs: Making Manipulation Policies Mobile with Manipulation-Centric Whole-body Controllers
JAX-accelerated Meta-Reinforcement Learning Environments Inspired by XLand and MiniGrid 🏎️
Awesome LLM Papers and repos on very comprehensive topics.
[CVPR 2024] Code for the paper "Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model"
This Python-based simulation platform can realistically model various components of the UAV network, including the network layer, MAC layer and physical layer, as well as the UAV mobility model, energ...
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
Chinese-language reinforcement learning tutorial (the "Mushroom Book" 🍄). Read online: https://datawhalechina.github.io/easy-rl/
The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.
The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement lea...
An extensive library of AI resources including books, courses, papers, guides, articles, tutorials, notebooks, AI field advancements and more.
Official Implementation for "In-Context Reinforcement Learning for Variable Action Spaces"
Simplifying reinforcement learning for complex game environments
[CVPR 2024] Official code for EgoGen: An Egocentric Synthetic Data Generator
[AAAI 2024] GLOP: Learning Global Partition and Local Construction for Solving Large-scale Routing Problems in Real-time
Top paper collection for stock price prediction, quantitative trading. Covering top conferences and journals like KDD, WWW, CIKM, AAAI, IJCAI, ACL, EMNLP.
[NeurIPS 2024] GenRL: Multimodal foundation world models allow grounding language and video prompts into embodied domains, by turning them into sequences of latent world model states. Latent state seq...
[NeurIPS 2024] ReEvo: Large Language Models as Hyper-Heuristics with Reflective Evolution
Tactics2D: A Reinforcement Learning Environment Library with Generative Scenarios for Driving Decision-making
[RSS 2024]: Expressive Whole-Body Control for Humanoid Robots
Recall to Imagine, a model-based RL algorithm with superhuman memory. Oral (1.2%) @ ICLR 2024
🚗 This repository offers a ready-to-use training and evaluation environment for conducting various experiments using Deep Reinforcement Learning (DRL) in the CARLA simulator with the help of Stable B...
A framework for creating rich, 3D, Minecraft-like environments for AI research based on Minetest