Yosh

Bonus video #4 - Memorisation vs Learning

Added 2025-02-20 07:00:06 +0000 UTC

Here's another extra section I cut from the main video to keep it shorter. Once again, it gives more details on how the training works.

The trick I'm describing here (slightly perturbing the initial conditions so the AI doesn't overfit) is not new. I covered it at length in the pipe video, where I explained how a very small perturbation is enough to produce completely different runs, due to the chaotic nature of the game-AI interaction.
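
For readers who want to see the idea in code, here is a minimal sketch of perturbing the start state at every episode reset. It is not the actual training code: the state fields (`x`, `y`, `speed`), the noise scale and the `perturbed_start` helper are all hypothetical placeholders chosen for illustration.

```python
import random

# Hypothetical start state; the field names and values are placeholders,
# not the real game state.
BASE_STATE = {"x": 0.0, "y": 0.0, "speed": 0.0}


def perturbed_start(base_state, scale, rng):
    """Return a copy of base_state with small uniform noise added to each value."""
    return {key: value + rng.uniform(-scale, scale)
            for key, value in base_state.items()}


# Each episode begins from a slightly different state, so the agent
# cannot simply memorise one fixed sequence of inputs.
rng = random.Random(0)
starts = [perturbed_start(BASE_STATE, 1e-3, rng) for _ in range(3)]
for start in starts:
    print(start)
```

The noise scale is the knob that matters here: it has to be small enough that the start still looks like a normal start, but large enough that runs actually end up different.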

But here it was harder to keep the AI from overfitting to a single trajectory: the noseboost trick happens in a very short window right after the start (about 0.5 s, or 50 time-steps), which leaves little time for a small perturbation of the initial conditions to have a significant effect.
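
The timing issue can be illustrated with a toy chaotic system. The sketch below uses the logistic map as a stand-in (an assumption for illustration only, it has nothing to do with the game's physics) and tracks how far two runs drift apart when their starting values differ by a tiny `EPS`. The step counts of this toy are not comparable to game time-steps; the point is only that the gap starts out tiny and needs time to grow.

```python
def logistic_run(x0, steps, r=4.0):
    """Iterate the logistic map x -> r * x * (1 - x), returning every value."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


EPS = 1e-10  # hypothetical size of the initial perturbation

run_a = logistic_run(0.2, 200)
run_b = logistic_run(0.2 + EPS, 200)

# Distance between the two runs at every step.
gaps = [abs(a - b) for a, b in zip(run_a, run_b)]

# Early on the two runs are nearly identical; the gap only becomes
# noticeable after the perturbation has had time to be amplified.
for t in (0, 5, 10, 20, 40, 200):
    print(t, gaps[t])
```

If the part that matters happens in the first few steps, both runs look the same there, which is why a small perturbation alone does little to prevent memorising a short opening sequence.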