Trajectories









Bookmarks

Lowest advantage episodes (unexpected failures):

trajectory 3, frame 409
trajectory 8, frame 293
trajectory 5, frame 384
trajectory 7, frame 94
trajectory 6, frame 263
trajectory 6, frame 475
trajectory 2, frame 305
trajectory 8, frame 490
trajectory 3, frame 30
trajectory 4, frame 149
trajectory 1, frame 237
trajectory 1, frame 105
trajectory 4, frame 17
trajectory 8, frame 47
trajectory 7, frame 385
trajectory 1, frame 396

Highest advantage episodes (unexpected successes):

trajectory 8, frame 296
trajectory 4, frame 345
trajectory 4, frame 426
trajectory 3, frame 489
trajectory 4, frame 474
trajectory 5, frame 397
trajectory 1, frame 303
trajectory 8, frame 142
trajectory 2, frame 49
trajectory 6, frame 387
trajectory 6, frame 354
trajectory 8, frame 5
trajectory 2, frame 162
trajectory 4, frame 13
trajectory 2, frame 466
trajectory 5, frame 200

Layers








Timeline

frame: 1
next action: no-op
advantage: 0.228
value function: 9.49

Attribution

Observation Positive attribution Negative attribution

policy logits:

sums of policy logits:

Attribution legend


Features: 1, 2, 3, 4, 5, 6, 7, 8

Hotkeys

go backwards
go forwards
toggle play/pause

Select a feature

Feature visualization
