Trajectories

Bookmarks

Lowest advantage episodes (unexpected failures):

trajectory 6, frame 354
trajectory 3, frame 311
trajectory 4, frame 214
trajectory 7, frame 481
trajectory 7, frame 169
trajectory 2, frame 372
trajectory 4, frame 415
trajectory 1, frame 366
trajectory 5, frame 179
trajectory 3, frame 40
trajectory 8, frame 16
trajectory 6, frame 152
trajectory 2, frame 88
trajectory 8, frame 468
trajectory 8, frame 200
trajectory 4, frame 269

Highest advantage episodes (unexpected successes):

trajectory 1, frame 466
trajectory 3, frame 363
trajectory 4, frame 233
trajectory 8, frame 261
trajectory 7, frame 111
trajectory 4, frame 484
trajectory 7, frame 221
trajectory 1, frame 35
trajectory 3, frame 95
trajectory 1, frame 124
trajectory 7, frame 505
trajectory 2, frame 324
trajectory 4, frame 496
trajectory 3, frame 400
trajectory 5, frame 485
trajectory 7, frame 384
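
The two bookmark lists above rank frames by advantage: the lowest values mark unexpected failures and the highest mark unexpected successes. A minimal sketch of how such lists could be assembled is given below. It assumes a per-frame advantage estimate is already available; the `Frame` record, the `bookmark_extremes` helper, and the toy values are hypothetical and are not taken from the interface itself.

```python
from typing import List, Tuple

# Hypothetical record: (trajectory index, frame index, advantage estimate).
Frame = Tuple[int, int, float]


def bookmark_extremes(frames: List[Frame], k: int = 16) -> Tuple[List[Frame], List[Frame]]:
    """Return the k lowest-advantage and k highest-advantage frames.

    Lowest advantage comes first in the first list (unexpected failures);
    highest advantage comes first in the second list (unexpected successes).
    """
    if k <= 0:
        return [], []
    ranked = sorted(frames, key=lambda f: f[2])  # ascending by advantage
    lowest = ranked[:k]
    highest = ranked[-k:][::-1]  # take the top k, then put the largest first
    return lowest, highest


# Toy advantage values, invented purely for illustration.
frames = [
    (1, 10, -0.5),
    (1, 11, 0.2),
    (2, 5, 1.3),
    (2, 6, -2.0),
    (3, 7, 0.0),
]
low, high = bookmark_extremes(frames, k=2)
for traj, frame, adv in low:
    print(f"trajectory {traj}, frame {frame} (advantage {adv})")
```

The default `k = 16` simply mirrors the sixteen entries shown in each list above; when `k` is larger than half the number of frames, the two lists overlap.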

Layers


Timeline

frame: 1 policy: next action:
no-op, D, A, W, S, Q, E
fps
advantage: 0.125
value function: 10.4

Attribution

Observation | Positive attribution | Negative attribution

policy logits:

sums of policy logits:


Attribution legend

Click to expand feature
Hover to isolate

1, 2, 3, 4, 5, 6, 7, 8, not shown, residual (everything else)

Hotkeys

go backwards
go forwards
toggle play/pause

Select a feature

Feature visualization

zoom in | zoom out
fewer patches | more patches