Trajectories

Bookmarks

Lowest advantage episodes (unexpected failures):

trajectory 4, frame 164
trajectory 4, frame 449
trajectory 1, frame 289
trajectory 8, frame 220
trajectory 3, frame 390
trajectory 8, frame 99
trajectory 1, frame 204
trajectory 7, frame 308
trajectory 6, frame 402
trajectory 8, frame 292
trajectory 2, frame 466
trajectory 5, frame 434
trajectory 8, frame 411
trajectory 6, frame 344
trajectory 2, frame 198
trajectory 3, frame 184

Highest advantage episodes (unexpected successes):

trajectory 8, frame 466
trajectory 4, frame 333
trajectory 5, frame 470
trajectory 1, frame 250
trajectory 1, frame 65
trajectory 7, frame 412
trajectory 7, frame 323
trajectory 2, frame 499
trajectory 4, frame 379
trajectory 1, frame 38
trajectory 6, frame 373
trajectory 2, frame 247
trajectory 1, frame 136
trajectory 8, frame 355
trajectory 8, frame 266
trajectory 6, frame 289

Layers


Timeline

frame: 1
policy (actions): no-op, A, B
next action: B
advantage: −0.181
value function: 9.61

Attribution

Observation Positive attribution Negative attribution

policy logits:

sums of policy logits:


Attribution legend

1, 2, 3, 4, 5, 6, 7, 8, not shown, residual (everything else)

Feature visualization
