In-browser PPO training demo, made possible by tinygrad: TinyJit -> WebGPU kernels.
Requires WebGPU.
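Since the demo only runs where WebGPU is available, a quick feature check can tell you whether your browser and platform qualify. The sketch below is not taken from the demo: `checkWebGpu`, `NavigatorLike`, `GpuLike`, and `WebGpuStatus` are hypothetical names introduced for illustration. It relies only on the standard `navigator.gpu.requestAdapter()` call, which resolves to `null` when no suitable adapter exists.

```typescript
// Minimal structural types so the sketch compiles without @webgpu/types.
// These interfaces are assumptions for illustration, not the demo's code.
interface GpuLike {
  requestAdapter(): Promise<object | null>;
}
interface NavigatorLike {
  gpu?: GpuLike;
}

type WebGpuStatus = "ok" | "no-api" | "no-adapter";

// Hypothetical helper: reports whether WebGPU looks usable.
async function checkWebGpu(nav: NavigatorLike): Promise<WebGpuStatus> {
  // Browsers without WebGPU support do not expose navigator.gpu at all.
  if (!nav.gpu) return "no-api";
  // The API can exist while no adapter is available for this OS/driver.
  const adapter = await nav.gpu.requestAdapter();
  return adapter ? "ok" : "no-adapter";
}
```

In a browser console you could call `checkWebGpu(navigator as NavigatorLike).then(console.log)`. A result of "no-api" or "no-adapter" would suggest the page cannot start training on that setup.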
Discussion (6 Comments, originally on HackerNews)
I noticed that if you go from training to watch and then back, the training score temporarily drops significantly.
avg500 -4.6 last 500 episodes
peak 3959.3 best window
roll/s 20.68 20-step avg
progress 4388 562749 episodes
Looks like this is for Linux and Windows, on NetBSD I get this issue :(