2 results found

Clean baseline implementation of PPO using an episodic TransformerXL memory
Created 2022-05-04
9 commits to main branch, last one 5 months ago

Baseline implementation of recurrent PPO using truncated BPTT
Created 2021-06-07
13 commits to main branch, last one about a year ago