1 result found Sort:
PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) ...
Created
2017-08-22
274 commits to master branch, last one 3 years ago