Trajectories

Bookmarks

Lowest advantage episodes (unexpected failures):

trajectory 6, frame 284
trajectory 1, frame 126
trajectory 4, frame 174
trajectory 4, frame 322
trajectory 7, frame 168
trajectory 1, frame 240
trajectory 7, frame 20
trajectory 2, frame 331
trajectory 5, frame 467
trajectory 7, frame 480
trajectory 1, frame 439
trajectory 6, frame 82
trajectory 4, frame 150
trajectory 5, frame 366
trajectory 4, frame 474
trajectory 3, frame 365

Highest advantage episodes (unexpected successes):

trajectory 3, frame 78
trajectory 1, frame 85
trajectory 5, frame 317
trajectory 7, frame 82
trajectory 2, frame 188
trajectory 5, frame 376
trajectory 4, frame 42
trajectory 1, frame 488
trajectory 1, frame 409
trajectory 6, frame 386
trajectory 8, frame 237
trajectory 7, frame 176
trajectory 3, frame 446
trajectory 7, frame 332
trajectory 3, frame 122
trajectory 3, frame 330

Layers


Timeline

frame: 1 policy: next action:
no-op, A, B
fps
advantage: 0.0769
value function: 9.67

Attribution

Observation Positive attribution Negative attribution

policy logits:

sums of policy logits:


Attribution legend

1
2
3
4
5
6
7
8
not shown
residual (everything else)

Select a feature

Feature visualization
