Trajectories

Bookmarks

Lowest advantage episodes (unexpected failures):

trajectory 2, frame 443
trajectory 3, frame 229
trajectory 5, frame 264
trajectory 6, frame 106
trajectory 1, frame 156
trajectory 8, frame 211
trajectory 4, frame 150
trajectory 7, frame 155

Highest advantage episodes (unexpected successes):

trajectory 7, frame 102
trajectory 2, frame 261
trajectory 3, frame 504
trajectory 8, frame 283
trajectory 6, frame 477
trajectory 4, frame 112
trajectory 1, frame 327
trajectory 5, frame 96

Layers


Timeline

frame: 1
policy (actions): no-op, D, A, W, S, Q, E
next action: Q
advantage: −0.355
value function: 17

Attribution

Panels: Observation, Positive attribution, Negative attribution

Plots: policy logits; sums of policy logits

Attribution legend


Legend entries: 1, 2, 3, 4, 5, 6, 7, 8, not shown, residual (everything else)

Feature visualization
