1 result found Sort:

PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) ...
Created 2017-08-22
274 commits to master branch, last one 2 years ago