Trajectories

Bookmarks

Lowest advantage episodes (unexpected failures):

trajectory 7, frame 200
trajectory 5, frame 5
trajectory 8, frame 132
trajectory 6, frame 376
trajectory 3, frame 160
trajectory 5, frame 460
trajectory 3, frame 37
trajectory 4, frame 44
trajectory 6, frame 98
trajectory 4, frame 395
trajectory 2, frame 120
trajectory 3, frame 268
trajectory 8, frame 322
trajectory 2, frame 27
trajectory 7, frame 352
trajectory 2, frame 370

Highest advantage episodes (unexpected successes):

trajectory 8, frame 445
trajectory 7, frame 446
trajectory 1, frame 334
trajectory 7, frame 211
trajectory 7, frame 76
trajectory 5, frame 69
trajectory 7, frame 49
trajectory 6, frame 293
trajectory 3, frame 366
trajectory 2, frame 148
trajectory 4, frame 91
trajectory 6, frame 463
trajectory 6, frame 174
trajectory 2, frame 404
trajectory 2, frame 292
trajectory 7, frame 483

Layers


Timeline

frame: 1
policy:
next action:
no-op, A, B
fps
advantage: −0.162
value function: 9.67

Attribution

Observation | Positive attribution | Negative attribution

policy logits:

sums of policy logits:


Attribution legend

Click to expand feature
Hover to isolate

1, 2, 3, 4, 5, 6, 7, 8, not shown, residual (everything else)

Hotkeys

go backwards
go forwards
toggle play/pause

Select a feature

Feature visualization

zoom in / zoom out
fewer patches / more patches