24 results found

558
4.8k
other
35
High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
Created 2019-06-07
828 commits to master branch, last one 18 days ago
PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) ...
Created 2017-08-22
274 commits to master branch, last one 2 years ago
Minimal implementation of clipped objective Proximal Policy Optimization (PPO) in PyTorch
Created 2018-09-27
98 commits to master branch, last one 6 months ago
184
1.1k
mit
26
PyTorch implementation of Deep Reinforcement Learning: Policy Gradient methods (TRPO, PPO, A2C) and Generative Adversarial Imitation Learning (GAIL). Fast Fisher vector product TRPO.
Created 2017-10-17
100 commits to master branch, last one 3 years ago
Proximal Policy Optimization (PPO) algorithm for Super Mario Bros
Created 2019-10-02
9 commits to master branch, last one 2 years ago
This repository contains PyTorch implementations of most classic deep reinforcement learning algorithms, including DQN, DDQN, Dueling Network, DDPG, SAC, A2C, PPO, TRPO. (More algorithms are st...
Created 2018-01-13
25 commits to master branch, last one 3 years ago
lagom: A PyTorch infrastructure for rapid prototyping of reinforcement learning algorithms.
Created 2017-12-21
703 commits to master branch, last one 4 years ago
Trading Environment(OpenAI Gym) + PPO(TensorForce)
Created 2018-08-25
8 commits to master branch, last one 5 years ago
Recurrent and multi-process PyTorch implementation of deep reinforcement Actor-Critic algorithms A2C and PPO
Created 2019-04-07
25 commits to master branch, last one about a year ago
Proximal Policy Optimization (PPO) algorithm for Contra
Created 2019-09-06
3 commits to master branch, last one 3 years ago
PyTorch implementation of some reinforcement learning algorithms: A2C, PPO, Behavioral Cloning from Observation (BCO), GAIL.
Created 2020-05-04
89 commits to master branch, last one 2 years ago
Curiosity-driven Exploration by Self-supervised Prediction
Created 2018-11-23
12 commits to master branch, last one about a year ago
Clean baseline implementation of PPO using an episodic TransformerXL memory
Created 2022-05-04
9 commits to main branch, last one 10 days ago
Baseline implementation of recurrent PPO using truncated BPTT
Created 2021-06-07
13 commits to main branch, last one 9 months ago
Code for the paper "Reinforced Curriculum Learning for Autonomous Driving in CARLA" (ICIP 2021)
Created 2020-03-12
45 commits to master branch, last one about a year ago
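A reinforcement learning algorithm library containing code for the current mainstream reinforcement learning algorithms (value-based and policy-based); all of the code has been debugged and runs.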
Created 2021-12-01
193 commits to main branch, last one 8 months ago
1
47
unknown
1
Jax implementation of Proximal Policy Optimization (PPO) specifically tuned for Procgen, with benchmarked results and saved model weights on all environments.
Created 2021-09-03
12 commits to main branch, last one about a year ago
This repository contains an application using ROS2 Humble, Gazebo, OpenAI Gym and Stable Baselines3 to train reinforcement learning agents for a path planning problem.
Created 2023-02-07
64 commits to humble branch, last one 9 months ago
🚗 3D web app that combines Proximal Policy Optimization with Three.js, enabling users to directly interact with or train AI models on a virtual racetrack.
Created 2023-09-21
101 commits to main branch, last one 3 months ago