Trajectories

Bookmarks

Lowest advantage episodes (unexpected failures):

trajectory 8, frame 508
trajectory 7, frame 47
trajectory 8, frame 232
trajectory 4, frame 193
trajectory 2, frame 441
trajectory 6, frame 232
trajectory 3, frame 202
trajectory 7, frame 460
trajectory 1, frame 7
trajectory 4, frame 427
trajectory 3, frame 475
trajectory 1, frame 504
trajectory 5, frame 135
trajectory 2, frame 115
trajectory 5, frame 436
trajectory 6, frame 465

Highest advantage episodes (unexpected successes):

trajectory 6, frame 221
trajectory 8, frame 242
trajectory 2, frame 270
trajectory 7, frame 230
trajectory 2, frame 492
trajectory 3, frame 425
trajectory 4, frame 24
trajectory 4, frame 458
trajectory 1, frame 338
trajectory 5, frame 443
trajectory 7, frame 468
trajectory 6, frame 500
trajectory 3, frame 272
trajectory 5, frame 186
trajectory 1, frame 465

Layers

Timeline

frame: 1
policy / next action: no-op, D, A, W, S, Q, E
advantage: −0.142
value function: 13.9

Attribution

Observation, Positive attribution, Negative attribution

policy logits:

sums of policy logits:


Attribution legend

Click to expand feature. Hover to isolate.

Features: 1, 2, 3, 4, 5, 6, 7, 8, not shown, residual (everything else)

Hotkeys

go backwards
go forwards
toggle play/pause

Select a feature

Feature visualization

zoom in / zoom out
fewer patches / more patches