Trajectories

Bookmarks

Lowest advantage episodes (unexpected failures):

trajectory 4, frame 353
trajectory 4, frame 64
trajectory 7, frame 87
trajectory 3, frame 168
trajectory 2, frame 117
trajectory 1, frame 29
trajectory 6, frame 151
trajectory 5, frame 355
trajectory 8, frame 31
trajectory 4, frame 66
trajectory 4, frame 247
trajectory 4, frame 490
trajectory 5, frame 217
trajectory 7, frame 377
trajectory 3, frame 377
trajectory 3, frame 73

Highest advantage episodes (unexpected successes):

trajectory 4, frame 290
trajectory 8, frame 54
trajectory 2, frame 161
trajectory 6, frame 142
trajectory 3, frame 461
trajectory 8, frame 415
trajectory 6, frame 465
trajectory 3, frame 25
trajectory 5, frame 406
trajectory 5, frame 12
trajectory 1, frame 445
trajectory 8, frame 222
trajectory 7, frame 180
trajectory 4, frame 394
trajectory 7, frame 272
trajectory 3, frame 331

Layers


Timeline

frame: 1 policy: next action:
no-op, D, A, W, S, Q, E
fps
advantage: −0.130
value function: 18.1

Attribution

Observation / Positive attribution / Negative attribution

policy logits:

sums of policy logits:


Attribution legend

Click to expand feature
Hover to isolate

1, 2, 3, 4, 5, 6, 7, 8
not shown
residual (everything else)

Hotkeys

go backwards
go forwards
toggle play/pause

Select a feature

Feature visualization

zoom in zoom out
fewer patches more patches