#StackBounty: #neural-networks #reinforcement-learning How do function approximators make large state-action spaces tractable for learning?

Bounty: 50

In policy gradient methods, how does parametrizing the policy with a deep neural network enable these methods to be applied to extremely large state and action spaces (potentially with continuous actions)? How does deep learning (or any function approximator) make large state-action spaces tractable during learning? And how does this compare to non-neural-network (e.g., tabular) methods, which would not be tractable?

So:

  1. Why is the state-action space so large in the first place (examples?)
  2. How does the neural network make the state-action space "small" (or tractable/learnable)?
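To make my confusion concrete, here is a minimal sketch (a hypothetical example of my own, not from any particular library) of what I understand a parameterized policy to be: a continuous state space has infinitely many states, so a tabular policy with one entry per state is impossible, yet a function approximator covers every state with a fixed number of parameters.

```python
import random

random.seed(0)

# Hypothetical setup: 4-dimensional continuous states, 2-dimensional
# continuous actions. A table would need one row per state, but the
# state space is uncountable.
STATE_DIM, ACTION_DIM = 4, 2

# A single linear layer as the simplest possible "network":
# mean action = W @ state + b. All states share these same weights.
W = [[random.gauss(0, 0.1) for _ in range(STATE_DIM)]
     for _ in range(ACTION_DIM)]
b = [0.0] * ACTION_DIM

def policy_mean(state):
    """Map any state to the mean of a Gaussian over continuous actions."""
    return [sum(w_ij * s_j for w_ij, s_j in zip(row, state)) + b_i
            for row, b_i in zip(W, b)]

# The learnable object is now this fixed parameter vector, not a table:
n_params = STATE_DIM * ACTION_DIM + ACTION_DIM
print(n_params)  # 10 parameters, shared across infinitely many states
print(policy_mean([0.1, -0.3, 0.5, 0.0]))  # works for any state, even unseen ones
```

My question is essentially how this parameter sharing and generalization (the same weights producing actions for states never visited during training) is what makes the problem tractable, compared to methods that must represent each state-action pair explicitly.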

