Trajectories

Bookmarks

Lowest-advantage episodes (unexpected failures):

trajectory 4, frame 159
trajectory 4, frame 498
trajectory 3, frame 78
trajectory 1, frame 466
trajectory 6, frame 479
trajectory 8, frame 187
trajectory 6, frame 106
trajectory 5, frame 229
trajectory 7, frame 292
trajectory 7, frame 148
trajectory 3, frame 196
trajectory 2, frame 189
trajectory 3, frame 103
trajectory 1, frame 217
trajectory 6, frame 305
trajectory 7, frame 230

Highest-advantage episodes (unexpected successes):

trajectory 1, frame 272
trajectory 5, frame 219
trajectory 3, frame 488
trajectory 2, frame 378
trajectory 8, frame 331
trajectory 7, frame 445
trajectory 1, frame 166
trajectory 2, frame 33
trajectory 2, frame 62
trajectory 1, frame 142
trajectory 4, frame 9
trajectory 4, frame 97
trajectory 2, frame 109
trajectory 5, frame 272
trajectory 3, frame 120
trajectory 1, frame 297
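
The two bookmark lists above rank frames by advantage. As a rough illustration of how such a ranking could be produced, the sketch below estimates the advantage at each frame as the discounted return minus the value function's prediction, then keeps the k lowest and k highest frames. Everything here is an assumption made for illustration: the function names, the input layout, the discount factor, 1-based frame numbering, and the choice of a plain Monte Carlo estimator are not taken from the interface, which may compute advantage differently.

```python
import numpy as np


def discounted_returns(rewards, gamma):
    """Return G_t = r_t + gamma * r_{t+1} + ... for every frame t."""
    returns = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def bookmark_frames(trajectories, k=16, gamma=0.99):
    """Pick the k lowest- and k highest-advantage frames.

    trajectories: hypothetical dict mapping a trajectory id to a pair
    (rewards, values) of equal-length per-frame sequences.
    Returns two lists of (trajectory_id, frame) pairs; frames are
    numbered from 1 (an assumption about the interface).
    """
    scored = []
    for traj_id, (rewards, values) in trajectories.items():
        rewards = np.asarray(rewards, dtype=float)
        values = np.asarray(values, dtype=float)
        # Assumed estimator: Monte Carlo return minus predicted value.
        advantages = discounted_returns(rewards, gamma) - values
        for frame, adv in enumerate(advantages, start=1):
            scored.append((float(adv), traj_id, frame))
    scored.sort()
    # Most negative advantage first: outcomes worse than predicted.
    lowest = [(traj, frame) for _, traj, frame in scored[:k]]
    # Most positive advantage first: outcomes better than predicted.
    highest = [(traj, frame) for _, traj, frame in reversed(scored[-k:])]
    return lowest, highest


# Toy data: trajectory 1 ends with a reward, trajectory 2 never gets one.
example = {
    1: ([0.0, 0.0, 10.0], [9.0, 5.0, 2.0]),
    2: ([0.0, 0.0, 0.0], [3.0, 7.0, 1.0]),
}
lowest, highest = bookmark_frames(example, k=2, gamma=1.0)
```

A strongly negative advantage means the value function expected a better outcome than the one that followed (an unexpected failure); a strongly positive one means the outcome beat the prediction (an unexpected success).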

Layers


Timeline

frame: 1
policy:
next action: no-op
advantage: −0.124
value function: 9.26

Attribution

Observation Positive attribution Negative attribution

policy logits:

sums of policy logits:


Attribution legend


1 2 3 4 5 6 7 8
not shown
residual (everything else)

Hotkeys

go backwards
go forwards
toggle play/pause

Select a feature

Feature visualization
