Value-Based Policies
keras_gym.policies.EpsilonGreedy | Value-based policy that selects actions using an epsilon-greedy strategy.
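class keras_gym.policies.EpsilonGreedy(q_function, epsilon=0.1, random_seed=None)
Value-based policy that selects actions using an epsilon-greedy strategy: with probability epsilon an action is drawn uniformly at random, and otherwise the action with the highest value under q_function is chosen.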
Parameters:
- q_function : callable
    A state-action value function object.
- epsilon : float between 0 and 1
    The probability of selecting an action uniformly at random. Defaults to 0.1.
- random_seed : int, optional
    Seed for the random state. Set it to get reproducible results.
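The selection rule described by these parameters can be sketched in a few lines of plain Python. This is a minimal illustration, not the keras_gym implementation: the helper `epsilon_greedy_action` is hypothetical, and it assumes the Q-values for one state are already available as a list with one entry per action.

```python
import random


def epsilon_greedy_action(q_values, epsilon=0.1, rng=None):
    """Pick an action index epsilon-greedily from a list of Q-values.

    Hypothetical helper for illustration only; not part of keras_gym.
    """
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must lie between 0 and 1")
    if rng is None:
        rng = random.Random()
    if rng.random() < epsilon:
        # Explore: every action is equally likely.
        return rng.randrange(len(q_values))
    # Exploit: index of the largest Q-value (the first one on ties).
    return max(range(len(q_values)), key=lambda a: q_values[a])


# With epsilon=0.0 the choice is purely greedy.
greedy_action = epsilon_greedy_action([0.1, 0.7, 0.2], epsilon=0.0)
```

With `epsilon=0.0` the helper always returns the greedy action, and with `epsilon=1.0` it ignores the Q-values entirely.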
__call__(self, s)
Draw an action from the current policy \(\pi(a|s)\).
Parameters:
- s : state observation
    A single state observation.
Returns:
- a : action
    A single action proposed under the current policy.
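To show how a callable policy with a seeded random state behaves, here is a small stand-in class. It only mirrors the constructor signature and `__call__` documented above. The class `SketchEpsilonGreedy`, the toy Q-function `toy_q`, and the assumption that the Q-function returns one value per action for a given state are all illustrative and not taken from keras_gym.

```python
import random


class SketchEpsilonGreedy:
    """Illustrative stand-in; not the keras_gym class."""

    def __init__(self, q_function, epsilon=0.1, random_seed=None):
        self.q_function = q_function
        self.epsilon = epsilon
        # A private generator makes the draws reproducible per instance.
        self._rng = random.Random(random_seed)

    def __call__(self, s):
        q_values = self.q_function(s)  # assumed: one Q-value per action
        if self._rng.random() < self.epsilon:
            return self._rng.randrange(len(q_values))
        return max(range(len(q_values)), key=lambda a: q_values[a])


def toy_q(s):
    # Hypothetical Q-function with two actions: action 1 is better when s > 0.
    return [0.0, 1.0] if s > 0 else [1.0, 0.0]


# Two policies built with the same seed produce the same action sequence.
policy_a = SketchEpsilonGreedy(toy_q, epsilon=0.5, random_seed=42)
policy_b = SketchEpsilonGreedy(toy_q, epsilon=0.5, random_seed=42)
actions_a = [policy_a(1) for _ in range(20)]
actions_b = [policy_b(1) for _ in range(20)]
```

Giving each policy its own generator, rather than seeding the global one, keeps two policies from interfering with each other's random streams.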
dist_params(self, s)
Get the parameters of the (conditional) probability distribution \(\pi(a|s)\).
Parameters:
- s : state observation
    A single state observation.
Returns:
- params : ndarray
    An array containing the distribution parameters.
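For a discrete action space, the textbook epsilon-greedy distribution assigns \(\pi(a|s) = \epsilon/n\) to each of the \(n\) actions and adds the remaining \(1 - \epsilon\) to the greedy action. The sketch below computes those probabilities. Whether `dist_params` returns exactly this array is an assumption, and the helper `epsilon_greedy_probs` is hypothetical.

```python
def epsilon_greedy_probs(q_values, epsilon=0.1):
    """Action probabilities of an epsilon-greedy policy for one state.

    Hypothetical helper for illustration only; not part of keras_gym.
    """
    n = len(q_values)
    greedy = max(range(n), key=lambda a: q_values[a])
    # Every action shares the exploration mass epsilon equally ...
    probs = [epsilon / n] * n
    # ... and the greedy action receives the remaining 1 - epsilon.
    probs[greedy] += 1.0 - epsilon
    return probs


probs = epsilon_greedy_probs([0.1, 0.7, 0.2], epsilon=0.3)
```

For the Q-values above with `epsilon=0.3`, the result is roughly 0.1 for the two non-greedy actions and 0.8 for the greedy one, up to floating-point rounding.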