Trajectories

Bookmarks

Lowest advantage episodes (unexpected failures):

trajectory 2, frame 234
trajectory 7, frame 463
trajectory 4, frame 500
trajectory 1, frame 498
trajectory 5, frame 499
trajectory 3, frame 500
trajectory 6, frame 495
trajectory 8, frame 491
trajectory 2, frame 395
trajectory 7, frame 511

Highest advantage episodes (unexpected successes):

trajectory 1, frame 260
trajectory 8, frame 260
trajectory 6, frame 257
trajectory 5, frame 256
trajectory 2, frame 496
trajectory 4, frame 253
trajectory 7, frame 260
trajectory 3, frame 256
trajectory 2, frame 133
trajectory 7, frame 500

Layers


Timeline

frame: 1
policy: no-op, D, A, W, S, Q, E
next action: E
fps
advantage: 0.00948
value function: 24.3

Attribution

Observation, Positive attribution, Negative attribution

policy logits:

sums of policy logits:


Attribution legend

Click to expand feature
Hover to isolate

1, 2, 3, 4, 5, 6, 7, 8, not shown, residual (everything else)

Hotkeys

go backwards
go forwards
toggle play/pause

Select a feature

Feature visualization

zoom in zoom out
fewer patches more patches