Trajectories

Bookmarks

Lowest advantage episodes (unexpected failures):

trajectory 1, frame 358
trajectory 2, frame 204
trajectory 6, frame 392
trajectory 4, frame 197
trajectory 5, frame 307
trajectory 6, frame 1
trajectory 7, frame 63
trajectory 5, frame 394
trajectory 6, frame 230
trajectory 5, frame 500
trajectory 3, frame 330
trajectory 5, frame 185
trajectory 3, frame 421
trajectory 4, frame 1
trajectory 3, frame 96
trajectory 8, frame 428

Highest advantage episodes (unexpected successes):

trajectory 1, frame 435
trajectory 4, frame 131
trajectory 7, frame 164
trajectory 2, frame 445
trajectory 6, frame 330
trajectory 1, frame 53
trajectory 4, frame 406
trajectory 4, frame 36
trajectory 5, frame 61
trajectory 8, frame 163
trajectory 5, frame 9
trajectory 3, frame 489
trajectory 3, frame 229
trajectory 1, frame 131
trajectory 6, frame 462
trajectory 8, frame 473

Layers

Timeline

frame: 1
policy: next action: A (actions: no-op, A, B)
advantage: 0.269
value function: 9.44

Attribution

Columns: Observation, Positive attribution, Negative attribution

policy logits:

sums of policy logits:

Attribution legend

Click to expand feature: 1, 2, 3, 4, 5, 6, 7, 8

Hotkeys

go backwards
go forwards
toggle play/pause

Select a feature

Feature visualization

zoom in / zoom out
fewer patches / more patches