In RL we often make use of data caching. This might be short-term caching, over the course of an episode, or it might be long-term caching as is done in experience replay.
Our short-term caching objects allow us to cache experience within an episode.
One of them caches all transitions collected over an entire episode and then gives us back
the \(\gamma\)-discounted returns when the episode finishes.
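To make the episode-level caching concrete, here is a minimal sketch of how such a cache could turn stored rewards into \(\gamma\)-discounted returns. The class name `EpisodeCache` and its `add` and `flush` methods are hypothetical and chosen for illustration only; they are not this library's API.

```python
class EpisodeCache:
    """Hypothetical episode cache (illustration only, not the library's class)."""

    def __init__(self, gamma):
        self.gamma = gamma
        self._transitions = []  # (state, action, reward) tuples of one episode

    def add(self, state, action, reward):
        """Store one transition of the current episode."""
        self._transitions.append((state, action, reward))

    def flush(self):
        """Call once the episode has finished.

        Returns a list of (state, action, discounted_return) tuples in
        chronological order and empties the cache.
        """
        out = []
        running_return = 0.0
        # Walk backwards so that G_t = r_t + gamma * G_{t+1}.
        for state, action, reward in reversed(self._transitions):
            running_return = reward + self.gamma * running_return
            out.append((state, action, running_return))
        self._transitions.clear()
        out.reverse()
        return out
```

Walking the episode backwards keeps the computation linear in the episode length, since each return reuses the one computed for the following time step.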
Another short-term caching object is
NStepCache, which keeps an \(n\)-sized sliding window
of transitions that allows us to do \(n\)-step bootstrapping.
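The sliding-window idea can be sketched as follows. This is an illustrative stand-in under assumed names, not the actual NStepCache implementation: the class `NStepWindow`, its `step` method, and the shape of the emitted tuples are all hypothetical.

```python
from collections import deque


class NStepWindow:
    """Hypothetical n-step sliding window (illustration only)."""

    def __init__(self, n, gamma):
        self.n = n
        self.gamma = gamma
        self._window = deque()  # at most n (state, action, reward) tuples

    def _partial_return(self):
        # Discounted sum of the rewards currently held in the window.
        return sum(self.gamma ** k * r
                   for k, (_, _, r) in enumerate(self._window))

    def step(self, state, action, reward, next_state, done):
        """Add one transition and return the n-step transitions now ready.

        Each emitted tuple is
        (state, action, partial_return, bootstrap_state, bootstrap_discount),
        so a learning target would be
        partial_return + bootstrap_discount * V(bootstrap_state).
        """
        self._window.append((state, action, reward))
        out = []
        if done:
            # Episode ended: flush everything, no bootstrapping needed.
            while self._window:
                s, a, _ = self._window[0]
                out.append((s, a, self._partial_return(), None, 0.0))
                self._window.popleft()
        elif len(self._window) == self.n:
            # Window is full: emit the oldest transition, bootstrapping
            # from the state that follows the newest one.
            s, a, _ = self._window[0]
            out.append((s, a, self._partial_return(), next_state,
                        self.gamma ** self.n))
            self._window.popleft()
        return out
```

With `n=1` this reduces to ordinary one-step bootstrapping. Larger `n` delays each update by up to `n` steps in exchange for a target that relies less on the bootstrapped estimate.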
Experience Replay Buffer
At the moment, we only have one long-term caching object, which is the experience replay buffer.
This object can hold an arbitrary number of transitions; the only constraint is
the amount of available memory on your machine.
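As a rough illustration of long-term caching, the sketch below stores transitions and samples them uniformly at random. The class `ReplayBuffer`, its optional `capacity` argument, and the `seed` argument are assumptions made for this example; the library's own buffer may differ in interface and behavior.

```python
import random


class ReplayBuffer:
    """Hypothetical replay buffer (illustration only, not the library's class)."""

    def __init__(self, capacity=None, seed=None):
        # capacity=None means unbounded: limited only by available memory.
        self.capacity = capacity
        self._storage = []
        self._next = 0  # index of the oldest entry once the buffer is full
        self._rng = random.Random(seed)

    def add(self, transition):
        """Store a transition, overwriting the oldest one if at capacity."""
        if self.capacity is None or len(self._storage) < self.capacity:
            self._storage.append(transition)
        else:
            self._storage[self._next] = transition
            self._next = (self._next + 1) % self.capacity

    def sample(self, batch_size):
        """Return batch_size distinct transitions, drawn uniformly at random."""
        return self._rng.sample(self._storage, batch_size)

    def __len__(self):
        return len(self._storage)
```

Leaving `capacity` unset mirrors the unbounded behavior described above. Setting it is a common way to cap memory use, at the cost of forgetting the oldest experience.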