Trajectories

Bookmarks

Lowest advantage episodes (unexpected failures):

trajectory 8, frame 206
trajectory 1, frame 300
trajectory 8, frame 432
trajectory 5, frame 369
trajectory 6, frame 501
trajectory 1, frame 38
trajectory 7, frame 184
trajectory 8, frame 125
trajectory 2, frame 345
trajectory 1, frame 180
trajectory 7, frame 130
trajectory 4, frame 214
trajectory 5, frame 49
trajectory 3, frame 180
trajectory 5, frame 178
trajectory 3, frame 165

Highest advantage episodes (unexpected successes):

trajectory 8, frame 477
trajectory 7, frame 128
trajectory 3, frame 445
trajectory 1, frame 323
trajectory 3, frame 17
trajectory 7, frame 481
trajectory 3, frame 500
trajectory 6, frame 312
trajectory 1, frame 26
trajectory 2, frame 34
trajectory 5, frame 450
trajectory 8, frame 136
trajectory 2, frame 370
trajectory 1, frame 495
trajectory 8, frame 349
trajectory 5, frame 266

Layers


Timeline

frame: 1
policy:
next action:
no-op
A
B
fps
advantage: 0.0311
value function: 9.76

Attribution

Observation Positive attribution Negative attribution

policy logits:

sums of policy logits:


Attribution legend

Click to expand feature
Hover to isolate

1 2 3 4 5 6 7 8
not shown
residual (everything else)

Hotkeys

go backwards
go forwards
toggle play/pause

Select a feature

Feature visualization

zoom in zoom out
fewer patches more patches