28 results found

672
5.9k
other
38
High-quality single-file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
Created 2019-06-07
832 commits to master branch, last one about a month ago
PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) ...
Created 2017-08-22
274 commits to master branch, last one 3 years ago
301
3.2k
apache-2.0
28
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
Created 2023-07-30
1,010 commits to main branch, last one 21 hours ago
Minimal implementation of clipped objective Proximal Policy Optimization (PPO) in PyTorch
Created 2018-09-27
98 commits to master branch, last one about a year ago
189
1.1k
mit
27
PyTorch implementation of Deep Reinforcement Learning: Policy Gradient methods (TRPO, PPO, A2C) and Generative Adversarial Imitation Learning (GAIL). Fast Fisher vector product TRPO.
Created 2017-10-17
100 commits to master branch, last one 3 years ago
Proximal Policy Optimization (PPO) algorithm for Super Mario Bros
Created 2019-10-02
9 commits to master branch, last one 3 years ago
This repository contains PyTorch implementations of most classic deep reinforcement learning algorithms, including DQN, DDQN, Dueling Network, DDPG, SAC, A2C, PPO, TRPO. (More algorithms are st...
Created 2018-01-13
25 commits to master branch, last one 3 years ago
lagom: A PyTorch infrastructure for rapid prototyping of reinforcement learning algorithms.
Created 2017-12-21
703 commits to master branch, last one 5 years ago
Trading Environment (OpenAI Gym) + PPO (TensorForce)
Created 2018-08-25
8 commits to master branch, last one 6 years ago
Recurrent and multi-process PyTorch implementation of the deep reinforcement learning Actor-Critic algorithms A2C and PPO
Created 2019-04-07
25 commits to master branch, last one 2 years ago
Clean baseline implementation of PPO using an episodic TransformerXL memory
Created 2022-05-04
9 commits to main branch, last one 6 months ago
PyTorch implementation of some reinforcement learning algorithms: A2C, PPO, Behavioral Cloning from Observation (BCO), GAIL.
Created 2020-05-04
89 commits to master branch, last one 3 years ago
Proximal Policy Optimization (PPO) algorithm for Contra
Created 2019-09-06
3 commits to master branch, last one 3 years ago
Curiosity-driven Exploration by Self-supervised Prediction
Created 2018-11-23
12 commits to master branch, last one about a year ago
Baseline implementation of recurrent PPO using truncated BPTT
Created 2021-06-07
13 commits to main branch, last one about a year ago
Code for the paper "Reinforced Curriculum Learning for Autonomous Driving in CARLA" (ICIP 2021)
Created 2020-03-12
45 commits to master branch, last one 2 years ago
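A reinforcement learning algorithm library containing code for the current mainstream reinforcement learning algorithms (value-based and policy-based); all of the code has been debugged and runs.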
Created 2021-12-01
193 commits to main branch, last one about a year ago
3
53
unknown
1
JAX implementation of Proximal Policy Optimization (PPO) specifically tuned for Procgen, with benchmarked results and saved model weights on all environments.
Created 2021-09-03
12 commits to main branch, last one 2 years ago
This repository contains an application using ROS2 Humble, Gazebo, OpenAI Gym and Stable Baselines3 to train reinforcement learning agents for a path planning problem.
Created 2023-02-07
64 commits to humble branch, last one about a year ago
An implementation of Phasic Policy Gradient, a proposed improvement on Proximal Policy Optimization, in PyTorch
Created 2020-09-27
56 commits to master branch, last one 18 days ago
🚗 3D web app that combines Proximal Policy Optimization with Three.js, enabling users to directly interact with or train AI models on a virtual racetrack.
Created 2023-09-21
101 commits to main branch, last one 9 months ago
8
27
mit
3
Quantum error correction code AI-discovery with JAX
Created 2024-02-23
18 commits to main branch, last one 12 days ago