Trajectories

Bookmarks

Lowest advantage episodes (unexpected failures):

trajectory 7, frame 108
trajectory 1, frame 138
trajectory 5, frame 293
trajectory 6, frame 318
trajectory 8, frame 28
trajectory 1, frame 452
trajectory 3, frame 378
trajectory 1, frame 370
trajectory 3, frame 246
trajectory 6, frame 51
trajectory 2, frame 193
trajectory 5, frame 438
trajectory 7, frame 334
trajectory 7, frame 6
trajectory 4, frame 273
trajectory 5, frame 92

Highest advantage episodes (unexpected successes):

trajectory 7, frame 418
trajectory 3, frame 419
trajectory 5, frame 263
trajectory 4, frame 487
trajectory 5, frame 395
trajectory 2, frame 25
trajectory 6, frame 361
trajectory 7, frame 67
trajectory 1, frame 35
trajectory 3, frame 337
trajectory 4, frame 393
trajectory 7, frame 234
trajectory 6, frame 432
trajectory 5, frame 112
trajectory 6, frame 231
trajectory 6, frame 210

Layers


Timeline

frame: 1 policy: next action:
no-op
A
B
fps
advantage: −0.168
value function: 9.66

Attribution

Observation Positive attribution Negative attribution

policy logits:

sums of policy logits:


Attribution legend

Click to expand feature
Hover to isolate

1, 2, 3, 4, 5, 6, 7, 8
not shown
residual (everything else)

Hotkeys

go backwards
go forwards
toggle play/pause

Select a feature

Feature visualization
