Trajectories

Bookmarks

Lowest advantage episodes (unexpected failures):

trajectory 8, frame 413
trajectory 4, frame 410
trajectory 7, frame 90
trajectory 8, frame 121
trajectory 6, frame 79
trajectory 1, frame 33
trajectory 3, frame 301
trajectory 5, frame 281
trajectory 4, frame 247
trajectory 1, frame 483
trajectory 6, frame 251
trajectory 6, frame 171
trajectory 5, frame 465
trajectory 2, frame 62
trajectory 1, frame 173
trajectory 3, frame 75

Highest advantage episodes (unexpected successes):

trajectory 7, frame 96
trajectory 4, frame 503
trajectory 6, frame 106
trajectory 8, frame 260
trajectory 3, frame 105
trajectory 1, frame 450
trajectory 6, frame 470
trajectory 7, frame 139
trajectory 2, frame 408
trajectory 4, frame 161
trajectory 3, frame 471
trajectory 4, frame 23
trajectory 8, frame 288
trajectory 7, frame 344
trajectory 6, frame 254
trajectory 3, frame 17

Layers


Timeline

frame: 1
policy:
next action: no-op, D, A, W, S, Q, E
fps
advantage: −0.0553
value function: 20.3

Attribution

Observation, Positive attribution, Negative attribution

policy logits:

sums of policy logits:


Attribution legend

Click to expand feature
Hover to isolate

Features: 1, 2, 3, 4, 5, 6, 7, 8
not shown
residual (everything else)

Hotkeys

go backwards
go forwards
toggle play/pause

Select a feature

Feature visualization

zoom in / zoom out
fewer patches / more patches