Trajectories









Bookmarks

Lowest advantage episodes (unexpected failures):

trajectory 2, frame 276
trajectory 6, frame 316
trajectory 4, frame 435
trajectory 2, frame 59
trajectory 8, frame 231
trajectory 5, frame 19
trajectory 5, frame 196
trajectory 8, frame 443
trajectory 3, frame 66
trajectory 6, frame 213
trajectory 4, frame 96
trajectory 7, frame 191
trajectory 6, frame 60
trajectory 1, frame 452
trajectory 2, frame 422
trajectory 7, frame 423

Highest advantage episodes (unexpected successes):

trajectory 1, frame 497
trajectory 6, frame 331
trajectory 5, frame 336
trajectory 5, frame 477
trajectory 8, frame 391
trajectory 3, frame 42
trajectory 7, frame 43
trajectory 5, frame 366
trajectory 7, frame 219
trajectory 1, frame 49
trajectory 3, frame 262
trajectory 4, frame 33
trajectory 8, frame 6
trajectory 3, frame 461
trajectory 1, frame 418
trajectory 5, frame 289

Layers


Timeline

frame: 1 policy: next action:
no-op
A
B
fps
advantage: −0.152
value function: 9.64

Attribution

Observation Positive attribution Negative attribution

policy logits:

sums of policy logits:


Attribution legend

Click to expand feature
Hover to isolate

1, 2, 3, 4, 5, 6, 7, 8, not shown, residual (everything else)

Hotkeys

go backwards
go forwards
toggle play/pause

Select a feature

Feature visualization

zoom in, zoom out
fewer patches, more patches