The Library

Reinforcement Learning

Mar 10, 2025, 2 min read

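Reinforcement Learning can be thought of as an interplay between two processes that pull against each other: policy improvement, which generates a better policy from a value function, and policy evaluation, which generates a value function for a given policy. Each step invalidates the other's result, yet alternating them drives both toward a fixed point. In the tabular setting with exact evaluation, that fixed point is the optimal value function together with an optimal policy.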
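The alternation between evaluating a policy and improving it can be sketched as tabular policy iteration. This is a minimal illustration, not taken from any of the linked notes: the two-state MDP `P`, the discount `GAMMA`, and all function names are hypothetical and chosen only to keep the example small.

```python
# Hypothetical 2-state, 2-action MDP (an assumption for illustration).
# P[state][action] is a list of (probability, next_state, reward) triples.
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 1.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 2.0)]},
}
GAMMA = 0.9  # discount factor, also an assumption


def q_value(V, s, a):
    """Expected return of taking action a in state s, then following V."""
    return sum(p * (r + GAMMA * V[s2]) for p, s2, r in P[s][a])


def evaluate(policy, tol=1e-10):
    """Policy evaluation: compute the value function of a fixed policy."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            v_new = q_value(V, s, policy[s])
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < tol:
            return V


def improve(V):
    """Policy improvement: act greedily with respect to a value function."""
    return {s: max(P[s], key=lambda a: q_value(V, s, a)) for s in P}


def policy_iteration():
    """Alternate evaluation and improvement until the policy stops changing."""
    policy = {s: 0 for s in P}
    while True:
        V = evaluate(policy)
        new_policy = improve(V)
        if new_policy == policy:
            return policy, V
        policy = new_policy


if __name__ == "__main__":
    policy, V = policy_iteration()
    print(policy, V)
```

In this toy MDP the loop stops once the greedy policy no longer changes, which is the point where the value function and the policy are consistent with each other.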
Some approaches to learning a strategy in RL:
- Transfer Learning
- Imitation Learning
- Competitive Learning - learn how to counter an opponent’s actions in a simple game.
- Continual Learning - the model constantly adapts.

Topics

Reinforcement Learning - Notation Guide
The Setting for Reinforcement Learning
A Unified View on Reinforcement Learning Approaches
The Exploitation-Exploration Trade-Off
Markov Processes in Machine Learning
Backups in Reinforcement Learning
Dynamic Programming for Reinforcement Learning
Monte Carlo Methods in Reinforcement Learning
Temporal Difference Learning
N-step Bootstrapping
Eligibility Traces
A Unified View on Planning and Learning
Decision Time Planning
Function Approximation in Reinforcement Learning
Policy Gradient Methods
Multi-Agent Reinforcement Learning
Distributional Reinforcement Learning
Inverse Reinforcement Learning
Applications of Reinforcement Learning

Papers

Eysenbach et al.¹ propose DIAYN (Diversity Is All You Need), a method for learning skills without a reward function. A skill here is a latent-conditioned policy that behaves consistently; the learnt skills can then serve as building blocks for Hierarchical RL.
- For the skills to be useful, they must be trained to maximize coverage over the set of possible behaviors while remaining distinct from each other.
- The paper trains maximum entropy policies so that the resulting mixture of skills is as diverse as possible. Because no task reward is involved, the method carries over to other tasks easily.
  - Different skills specialize in visiting different states.
  - Skills are distinguished by the states they visit rather than by the actions they take. This way, actions that have the same effect on the environment are indistinguishable.
  - Skills are diverse and suited for complex tasks.
- The skills are learnt without supervision and can be fine-tuned with a task reward as needed.
- Learnt skills can be used for imitating an expert.
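To make "distinct skills that visit different states" concrete: in the paper, a discriminator q(z | s) tries to guess the active skill z from the visited state s, and the policy receives the pseudo-reward log q(z | s) - log p(z), where p(z) is a fixed uniform prior over skills. The snippet below is a minimal sketch of that reward only. The function names and the raw-logit input are assumptions for illustration, and the discriminator network and the maximum entropy policy update are left out.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_norm for x in logits]


def diayn_pseudo_reward(discriminator_logits, skill, num_skills):
    """Pseudo-reward log q(z | s) - log p(z) for the active skill z.

    discriminator_logits: hypothetical discriminator outputs for one state,
        one logit per skill.
    skill: index of the skill the policy is currently conditioned on.
    num_skills: number of skills; p(z) is assumed uniform over them.
    """
    log_q = log_softmax(discriminator_logits)[skill]
    log_p = -math.log(num_skills)
    return log_q - log_p


if __name__ == "__main__":
    # The discriminator is confident that this state belongs to skill 0.
    logits = [2.0, 0.0, 0.0, 0.0]
    print(diayn_pseudo_reward(logits, skill=0, num_skills=4))
    print(diayn_pseudo_reward(logits, skill=1, num_skills=4))
```

The reward is positive when the state makes the active skill easier to identify than chance and negative otherwise, which pushes each skill toward states that the other skills do not visit.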

Links

Sutton and Barto
Powell
OpenAI Spinning Up - explanations and reference implementations of various modern deep RL algorithms

Footnotes

¹ Eysenbach, Gupta, Ibarz, and Levine (2018) Diversity Is All You Need: Learning Skills Without a Reward Function
