Trajectories

Bookmarks

Lowest advantage episodes (unexpected failures):

trajectory 7, frame 151
trajectory 6, frame 448
trajectory 12, frame 242
trajectory 15, frame 4
trajectory 11, frame 411
trajectory 9, frame 97
trajectory 10, frame 258
trajectory 8, frame 7
trajectory 9, frame 301
trajectory 1, frame 402
trajectory 8, frame 325
trajectory 3, frame 44
trajectory 5, frame 192
trajectory 6, frame 68
trajectory 13, frame 429
trajectory 3, frame 366

Highest advantage episodes (unexpected successes):

trajectory 12, frame 249
trajectory 7, frame 509
trajectory 13, frame 462
trajectory 9, frame 322
trajectory 7, frame 152
trajectory 4, frame 251
trajectory 16, frame 273
trajectory 7, frame 65
trajectory 10, frame 439
trajectory 1, frame 369
trajectory 3, frame 304
trajectory 3, frame 13
trajectory 11, frame 6
trajectory 6, frame 453
trajectory 1, frame 32
trajectory 8, frame 377

Layers

Timeline

frame: 1 | policy and next action (no-op, A, B) | advantage: −0.129 | value function: 9.34

Attribution

Observation | Positive attribution | Negative attribution

policy logits:

sums of policy logits:


Attribution legend


Features: 1, 2, 3, 4, 5, 6, 7, 8, not shown, residual (everything else)


Select a feature

Feature visualization
