Trajectories

Bookmarks

Lowest advantage episodes (unexpected failures):

trajectory 3, frame 478
trajectory 1, frame 80
trajectory 6, frame 110
trajectory 8, frame 168
trajectory 2, frame 60
trajectory 7, frame 291
trajectory 5, frame 54
trajectory 4, frame 349
trajectory 3, frame 504

Highest advantage episodes (unexpected successes):

trajectory 8, frame 238
trajectory 2, frame 189
trajectory 6, frame 195
trajectory 5, frame 383
trajectory 7, frame 466
trajectory 1, frame 43
trajectory 4, frame 167
trajectory 3, frame 122
trajectory 3, frame 495

Layers


Timeline

frame: 1
policy: next action: D
actions: no-op, D, A, W, S, Q, E
advantage: −0.0365
value function: 19.8

Attribution

Observation Positive attribution Negative attribution

policy logits:

sums of policy logits:


Attribution legend

Click to expand feature
Hover to isolate

1 2 3 4 5 6 7 8
not shown
residual (everything else)

Hotkeys

go backwards
go forwards
toggle play/pause

Select a feature

Feature visualization

zoom in zoom out
fewer patches more patches