Trajectories

Bookmarks

Lowest advantage episodes (unexpected failures):

trajectory 6, frame 408
trajectory 7, frame 84
trajectory 8, frame 500
trajectory 5, frame 487
trajectory 2, frame 264
trajectory 5, frame 130
trajectory 1, frame 405
trajectory 6, frame 78
trajectory 5, frame 393
trajectory 3, frame 80
trajectory 7, frame 232
trajectory 1, frame 317
trajectory 6, frame 183
trajectory 1, frame 68
trajectory 4, frame 119
trajectory 5, frame 56

Highest advantage episodes (unexpected successes):

trajectory 7, frame 96
trajectory 5, frame 511
trajectory 2, frame 313
trajectory 7, frame 472
trajectory 2, frame 237
trajectory 3, frame 378
trajectory 6, frame 479
trajectory 8, frame 10
trajectory 8, frame 511
trajectory 5, frame 249
trajectory 4, frame 216
trajectory 7, frame 498
trajectory 6, frame 358
trajectory 2, frame 12
trajectory 1, frame 140
trajectory 1, frame 258

Layers


Timeline

frame: 1 policy: next action:
no-op
A
B
fps
advantage: −0.169
value function: 9.40

Attribution

Observation Positive attribution Negative attribution

policy logits:

sums of policy logits:


Attribution legend

Click to expand feature
Hover to isolate

1, 2, 3, 4, 5, 6, 7, 8, not shown, residual (everything else)

Hotkeys

go backwards
go forwards
toggle play/pause

Select a feature

Feature visualization

zoom in zoom out
fewer patches more patches