Trajectories

Bookmarks

Lowest advantage episodes (unexpected failures):

trajectory 8, frame 406
trajectory 5, frame 19
trajectory 6, frame 430
trajectory 2, frame 439
trajectory 8, frame 234
trajectory 4, frame 255
trajectory 2, frame 6
trajectory 7, frame 87
trajectory 7, frame 163
trajectory 8, frame 77
trajectory 6, frame 362
trajectory 1, frame 126
trajectory 7, frame 351
trajectory 7, frame 242
trajectory 4, frame 407
trajectory 6, frame 217

Highest advantage episodes (unexpected successes):

trajectory 6, frame 198
trajectory 8, frame 257
trajectory 4, frame 249
trajectory 4, frame 73
trajectory 8, frame 453
trajectory 6, frame 276
trajectory 6, frame 460
trajectory 7, frame 420
trajectory 4, frame 40
trajectory 1, frame 170
trajectory 7, frame 268
trajectory 7, frame 102
trajectory 6, frame 375
trajectory 2, frame 273
trajectory 3, frame 176
trajectory 7, frame 388

Layers


Timeline

frame: 1
next action: no-op
advantage: 0.0324
value function: 9.48

Attribution

Panels: observation, positive attribution, negative attribution, policy logits, sums of policy logits

Attribution legend

Features 1 to 8, not shown, residual (everything else)

