Search Results - RepositoryStats

Multi-Agent-Constrained-Policy-Optimisation chauncygu

26

164

other

2

Multi-Agent Constrained Policy Optimisation (MACPO; MAPPO-L).

policy-optimization safe-reinforcement-learning multi-agent-reinforcement-learning

Created 2021-10-06

71 commits to main branch, last one 11 months ago

policy_optimization liziniu

5

27

unknown

1

Code for the paper "Policy Optimization in RLHF: The Impact of Out-of-preference Data"

rlhf bandit policy-optimization large-language-models stochastic-approximation

Created 2023-12-12

1 commit to master branch, last one about a year ago

no-representation-no-trust CLAIRE-Labo

2

25

mit

3

Codebase to fully reproduce the results of "No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO" (Moalla et al. 2024). Uses TorchRL and provides extensive tools f...

deep-learning policy-optimization reinforcement-learning

Created 2024-04-30

5 commits to main branch, last one 4 months ago