Resource overview
Statistical inference and planning for Markov decision processes, and algorithms for reinforcement learning. Some highlights include:
Bayesian estimators including:
- Parametric conjugate distributions (e.g. Dirichlet/Multinomial)
- Non-parametric methods (Gaussian processes, various tree models)
- Approximate Bayesian Computation (ABC)
- Various (problem-specific) Monte-Carlo samplers.
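
As an illustration of the conjugate Dirichlet/Multinomial case listed above, the following is a minimal sketch and not code from this package. The function names `dirichlet_posterior` and `posterior_mean` are hypothetical and chosen only for this example. It relies on the standard result that a Dirichlet prior with parameters alpha, updated with categorical counts n, gives a Dirichlet posterior with parameters alpha + n.

```python
from collections import Counter


def dirichlet_posterior(prior_alpha, observations):
    """Return posterior Dirichlet parameters given categorical observations.

    prior_alpha: list of positive pseudo-counts, one per category.
    observations: iterable of category indices in range(len(prior_alpha)).
    """
    counts = Counter(observations)
    # Conjugate update: add the observed count of each category to its prior.
    return [a + counts.get(k, 0) for k, a in enumerate(prior_alpha)]


def posterior_mean(alpha):
    """Expected category probabilities under a Dirichlet(alpha) belief."""
    total = sum(alpha)
    return [a / total for a in alpha]


# Hypothetical example: uniform prior over three outcomes, five observations.
prior = [1.0, 1.0, 1.0]
data = [0, 2, 2, 1, 2]
posterior = dirichlet_posterior(prior, data)
mean = posterior_mean(posterior)
print(posterior, mean)
```

The update only needs the counts, which is why conjugate models are cheap to maintain online: each new observation increments one parameter.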
(Approximate) dynamic programming algorithms:
- Backwards induction / value iteration
- Policy iteration
- Rollout sampling policy iteration
- Least-Squares Policy Iteration
- Least-Squares Temporal Differences
- Fitted Value / Q-iteration.
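
To make the value iteration entry in the list above concrete, here is a small sketch for a finite MDP with known dynamics. It is not taken from this package: the data layout (`P[s][a]` as a list of `(probability, next_state)` pairs, `R[s][a]` as the expected reward) and the two-state example are assumptions made for illustration.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Tabular value iteration.

    P[s][a]: list of (probability, next_state) pairs (assumed layout).
    R[s][a]: expected immediate reward for taking action a in state s.
    Returns the value estimates and a greedy policy with respect to them.
    """
    n_states = len(P)

    def q_value(V, s, a):
        # One-step lookahead: reward plus discounted expected next value.
        return R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])

    V = [0.0] * n_states
    for _ in range(max_iters):
        V_new = [max(q_value(V, s, a) for a in range(len(P[s])))
                 for s in range(n_states)]
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:  # stop once a full sweep barely changes the values
            break

    policy = [max(range(len(P[s])), key=lambda a, s=s: q_value(V, s, a))
              for s in range(n_states)]
    return V, policy


# Hypothetical two-state MDP: staying in state 1 (action 0) pays reward 1.
P = [
    [[(1.0, 0)], [(1.0, 1)]],  # state 0: action 0 stays, action 1 moves to 1
    [[(1.0, 1)], [(1.0, 0)]],  # state 1: action 0 stays, action 1 moves to 0
]
R = [
    [0.0, 0.0],
    [1.0, 0.0],
]
V, policy = value_iteration(P, R, gamma=0.9)
print(V, policy)
```

In this example the optimal values can be checked by hand: staying in state 1 forever is worth 1 / (1 - 0.9) = 10, and moving there from state 0 is worth 0.9 × 10 = 9.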
Reinforcement learning algorithms:
- Stochastic approximators (Q-learning, Sarsa and various generalisations)
- Upper-confidence bound algorithms (UCB/UCRL)
- Bayesian algorithms (Thompson sampling, Upper/Lower Bayesian Bound algorithms)
- Gradient-based Bellman error minimisation (GBRL)
Example rl-glue
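
As a sketch of the simplest stochastic approximator named in the reinforcement learning list above, the following shows tabular Q-learning on a tiny deterministic environment. It is an illustrative example under stated assumptions, not this package's implementation: the `step` environment, the function name `q_learning_update`, the step size, and the uniformly random behaviour policy are all hypothetical choices.

```python
import random


def q_learning_update(Q, s, a, r, s_next, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update in place; return the TD error."""
    td_error = r + gamma * max(Q[s_next]) - Q[s][a]
    Q[s][a] += alpha * td_error
    return td_error


def step(s, a):
    """Hypothetical deterministic two-state environment.

    Returns (reward, next_state). Action 0 stays in the current state,
    action 1 switches state; only staying in state 1 pays reward 1.
    """
    if s == 0:
        return (0.0, 1) if a == 1 else (0.0, 0)
    return (1.0, 1) if a == 0 else (0.0, 0)


rng = random.Random(0)  # fixed seed so the run is repeatable
Q = [[0.0, 0.0], [0.0, 0.0]]
s = 0
for _ in range(5000):
    a = rng.randrange(2)  # uniformly random behaviour policy (off-policy)
    r, s_next = step(s, a)
    q_learning_update(Q, s, a, r, s_next)
    s = s_next

# Greedy policy with respect to the learned action values.
greedy = [max(range(2), key=lambda a, state=state: Q[state][a])
          for state in range(2)]
print(Q, greedy)
```

Because Q-learning is off-policy, the greedy policy can be read off the table even though the data were gathered by acting at random. Sarsa would differ only in bootstrapping from the action actually taken next rather than from the maximum.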