Trajectories

Bookmarks

Lowest advantage episodes (unexpected failures):

trajectory 6, frame 395
trajectory 2, frame 440
trajectory 2, frame 22
trajectory 8, frame 265
trajectory 4, frame 397
trajectory 7, frame 380
trajectory 3, frame 322
trajectory 4, frame 328
trajectory 8, frame 506
trajectory 5, frame 496
trajectory 5, frame 437
trajectory 5, frame 16
trajectory 5, frame 47
trajectory 7, frame 113
trajectory 2, frame 136
trajectory 4, frame 482

Highest advantage episodes (unexpected successes):

trajectory 5, frame 19
trajectory 7, frame 159
trajectory 8, frame 425
trajectory 4, frame 427
trajectory 8, frame 353
trajectory 2, frame 50
trajectory 5, frame 487
trajectory 2, frame 240
trajectory 7, frame 360
trajectory 2, frame 488
trajectory 4, frame 109
trajectory 6, frame 59
trajectory 6, frame 434
trajectory 4, frame 315
trajectory 5, frame 297
trajectory 4, frame 235

Layers


Timeline

frame: 1
policy / next action labels: no-op, A, B
advantage: 0.0220
value function: 9.79

Attribution

Panels: Observation, Positive attribution, Negative attribution
Readouts: policy logits, sums of policy logits


Attribution legend

1, 2, 3, 4, 5, 6, 7, 8, not shown, residual (everything else)

Feature visualization
