Search Results - RepositoryStats

1.1k

4.1k

mit

113

A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more

tf mcts keras gobang gomoku alphago othello pytorch alphazero self-play alpha-zero tensorflow alphago-zero deep-learning neural-network reinforcement-learning monte-carlo-tree-search

Created 2017-12-01

221 commits to master branch, last one 2 months ago

DI-engine opendilab

398

3.3k

apache-2.0

22

OpenDILab Decision AI Engine. The Most Comprehensive Reinforcement Learning Framework B.P.

Created 2021-07-04

850 commits to main branch, last one 3 days ago

LightZero opendilab

145

1.3k

apache-2.0

11

[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios (awesome MCTS)

Created 2022-10-08

202 commits to main branch, last one 2 days ago

DI-star opendilab

119

1.3k

apache-2.0

17

An artificial intelligence platform for StarCraft II with large-scale distributed training and grand-master agents.

league self-play starcraft2 deep-learning reinforcment-learning artificial-intelligence deep-reinforcement-learning

Created 2021-07-04

72 commits to main branch, last one 7 days ago

SPIN uclaml

99

1.1k

apache-2.0

12

The official implementation of Self-Play Fine-Tuning (SPIN)

self-play fine-tuning deep-learning large-language-models

Created 2024-02-04

100 commits to main branch, last one 10 months ago

SPPO uclaml

47

502

apache-2.0

28

The official implementation of Self-Play Preference Optimization (SPPO)

rlhf self-play fine-tuning deep-learning large-language-models

Created 2024-06-13

28 commits to main branch, last one about a month ago

TimeChamber inspirai

35

335

mit

10

A Massively Parallel Large Scale Self-Play Framework

isaac-gym self-play multi-agent reinforcement-learning deep-reinforcement-learning

Created 2022-08-17

39 commits to main branch, last one 2 years ago

gym-continuousDoubleAuction ChuaCheowHuan

31

144

mit

7

A custom MARL (multi-agent reinforcement learning) environment where multiple agents trade against one another (self-play) in a zero-sum continuous double auction. Ray [RLlib] is used for training.

ppo ray lstm marl rllib n-player zero-sum self-play double-auction zero-sum-games gym-environment limit-order-book quantitative-finance quantitative-trading financial-engineering market-microstructure high-frequency-trading multi-agent-reinforcement-learning

Created 2019-07-20

309 commits to master branch, last one 4 years ago

osrs-pvp-reinforcement-learning Naton1

30

101

unknown

4

Train a neural network to PvP in Old School RuneScape using reinforcement learning.

gym ppo java osrs rsps python pytorch runescape self-play deep-learning machine-learning oldschool-runescape reinforcement-learning artificial-intelligence

Created 2024-01-16

572 commits to master branch, last one about a year ago

alpha-zero-general cestpasphoto

15

46

mit

2

A very fast implementation of AlphaZero, applied to games like Splendor, Santorini, The Little Prince, … Browser version available

numba python alphago pytorch splendor alphazero machikoro santorini self-play minivilles alphago-zero santorini-game the-little-prince reinforcement-learning

Created 2021-02-10

656 commits to master branch, last one 2 months ago

td-gammon dellalibera

13

45

mit

2

TD-Gammon implementation

game pytorch self-play backgammon neural-network value-function reinforcement-learning artificial-intelligence convolutional-neural-networks temporal-differencing-learning

Created 2019-09-02

21 commits to master branch, last one 5 years ago