Trajectories

Bookmarks

Lowest advantage episodes (unexpected failures):

trajectory 7, frame 431
trajectory 7, frame 101
trajectory 4, frame 223
trajectory 7, frame 506
trajectory 8, frame 271
trajectory 1, frame 261
trajectory 5, frame 266
trajectory 6, frame 290
trajectory 6, frame 346
trajectory 1, frame 456
trajectory 8, frame 403
trajectory 1, frame 171
trajectory 5, frame 21
trajectory 6, frame 109
trajectory 1, frame 58
trajectory 4, frame 313

Highest advantage episodes (unexpected successes):

trajectory 7, frame 435
trajectory 7, frame 389
trajectory 5, frame 455
trajectory 3, frame 237
trajectory 1, frame 98
trajectory 7, frame 46
trajectory 6, frame 300
trajectory 5, frame 407
trajectory 4, frame 70
trajectory 6, frame 100
trajectory 8, frame 402
trajectory 7, frame 334
trajectory 5, frame 163
trajectory 5, frame 44
trajectory 7, frame 137
trajectory 2, frame 32

Layers


Timeline

frame: 1
policy / next action (no-op, A, B)
advantage: −0.0106
value function: 9.71

Attribution

Panels: Observation, Positive attribution, Negative attribution
Rows: policy logits, sums of policy logits

Attribution legend

Features: 1, 2, 3, 4, 5, 6, 7, 8, not shown, residual (everything else)

Feature visualization
