Trajectories

Bookmarks

Lowest advantage episodes (unexpected failures):

trajectory 5, frame 443
trajectory 1, frame 6
trajectory 1, frame 304
trajectory 3, frame 345
trajectory 5, frame 32
trajectory 6, frame 461
trajectory 1, frame 438
trajectory 3, frame 212
trajectory 8, frame 464
trajectory 3, frame 494
trajectory 8, frame 63
trajectory 2, frame 438
trajectory 2, frame 132
trajectory 6, frame 350
trajectory 4, frame 421
trajectory 1, frame 104

Highest advantage episodes (unexpected successes):

trajectory 7, frame 483
trajectory 1, frame 444
trajectory 3, frame 254
trajectory 1, frame 368
trajectory 3, frame 361
trajectory 7, frame 42
trajectory 1, frame 122
trajectory 2, frame 66
trajectory 2, frame 220
trajectory 7, frame 216
trajectory 2, frame 374
trajectory 1, frame 277
trajectory 3, frame 149
trajectory 4, frame 399
trajectory 5, frame 80
trajectory 6, frame 370

Layers


Timeline

frame: 1
policy:
next action:
no-op
A
B
fps
advantage: −0.506
value function: 9.74

Attribution

Observation | Positive attribution | Negative attribution

policy logits:

sums of policy logits:


Attribution legend

Click to expand feature
Hover to isolate

1
2
3
4
5
6
7
8
not shown
residual (everything else)

Hotkeys

go backwards
go forwards
toggle play/pause

Select a feature

Feature visualization

zoom in / zoom out
fewer patches / more patches