Trajectories









Bookmarks

Lowest advantage episodes (unexpected failures):

trajectory 2, frame 243
trajectory 3, frame 462
trajectory 7, frame 299
trajectory 1, frame 479
trajectory 1, frame 23
trajectory 8, frame 364
trajectory 7, frame 476
trajectory 5, frame 145
trajectory 6, frame 82
trajectory 3, frame 214
trajectory 1, frame 122
trajectory 8, frame 10
trajectory 4, frame 263
trajectory 1, frame 235
trajectory 7, frame 30
trajectory 5, frame 324

Highest advantage episodes (unexpected successes):

trajectory 1, frame 361
trajectory 7, frame 362
trajectory 4, frame 157
trajectory 6, frame 21
trajectory 5, frame 444
trajectory 2, frame 118
trajectory 8, frame 33
trajectory 5, frame 505
trajectory 3, frame 6
trajectory 8, frame 161
trajectory 5, frame 250
trajectory 2, frame 4
trajectory 3, frame 484
trajectory 8, frame 99
trajectory 1, frame 156
trajectory 3, frame 192

Layers


Timeline

frame: 1 policy: next action:
no-op D A W S Q E
fps
advantage: −0.0185
value function: 13

Attribution

Observation Positive attribution Negative attribution

policy logits:

sums of policy logits:


Attribution legend

Click to expand feature
Hover to isolate

1 2 3 4 5 6 7 8
not shown
residual (everything else)

Hotkeys

go backwards
go forwards
toggle play/pause

Select a feature

Feature visualization

zoom in zoom out
fewer patches more patches