Statistics for topic reinforcement-learning
RepositoryStats tracks 595,858 GitHub repositories, of which 1,337 are tagged with the reinforcement-learning topic. The most common primary language for repositories using this topic is Python (904). Other languages include Jupyter Notebook (179), C++ (47), C# (16), Julia (11), and HTML (11).
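The counts above are easier to compare as shares. The following is a minimal sketch that converts them to percentages, using only the figures quoted in the summary. The names `share`, `LANGUAGE_COUNTS`, and `unlisted` are illustrative and not part of RepositoryStats. Reading the remainder as repositories with some other (or no) primary language is an assumption.

```python
# Counts copied from the summary above.
TRACKED_REPOS = 595_858   # all repositories tracked
TOPIC_REPOS = 1_337       # repositories tagged reinforcement-learning
LANGUAGE_COUNTS = {
    "Python": 904,
    "Jupyter Notebook": 179,
    "C++": 47,
    "C#": 16,
    "Julia": 11,
    "HTML": 11,
}


def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Share of all tracked repositories that carry the topic.
topic_share = share(TOPIC_REPOS, TRACKED_REPOS)

# Share of topic repositories per listed primary language.
language_shares = {
    lang: share(count, TOPIC_REPOS) for lang, count in LANGUAGE_COUNTS.items()
}

# Topic repositories not covered by the six listed languages
# (assumed to have some other, or no, primary language).
unlisted = TOPIC_REPOS - sum(LANGUAGE_COUNTS.values())

if __name__ == "__main__":
    print(f"Topic share of tracked repositories: {topic_share:.2f}%")
    for lang, pct in language_shares.items():
        print(f"{lang}: {pct:.1f}%")
    print(f"Repositories outside the listed languages: {unlisted}")
```

On these figures, Python accounts for roughly two thirds of the tagged repositories, and the six listed languages leave 169 repositories unaccounted for.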
Stargazers over time for topic reinforcement-learning
Most starred repositories for topic reinforcement-learning
Trending repositories for topic reinforcement-learning
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga...
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)
A framework for creating rich, 3D, Minecraft-like single and multi-agent environments for AI research based on Minetest
A Large Recurrent Action Model: xLSTM enables Fast Inference for Robotics Tasks
PyTorch implementation of Constrained Reinforcement Learning for Soft Actor Critic Algorithm
[NAACL 2024] Making Language Models Better Tool Learners with Execution Feedback
Deep reinforcement learning without experience replay, target networks, or batch updates.
Implementation of papers in 100 lines of code.
A deep reinforcement learning framework for generating formulaic alpha factors for quantitative investment, powered by GFlowNet, implemented in Python&PyTorch.
DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model. NeurIPS 2024 Spotlight.
Hung-yi Lee's Deep Learning Tutorial (recommended by Professor Hung-yi Lee 👍, the "Apple Book" 🍎). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases
Official Implementation for "In-Context Reinforcement Learning for Variable Action Spaces"
[CVPR 2024] Official code for EgoGen: An Egocentric Synthetic Data Generator