Debugging Deterministic Simulations
Deterministic simulations are very helpful for debugging when they work: you can record and replay them to reliably reproduce all sorts of issues.
But what do you do when the simulation diverges and is no longer deterministic? We dealt with this at Relic Entertainment. Each player in a multiplayer game would simulate their world and transmit their choices to every other player. Everyone would happily simulate lots of things the same way without having to explicitly send all the data over the network.
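To make the setup concrete, here is a minimal sketch of that kind of lockstep simulation. The `Command` and `World` types are hypothetical and invented for illustration, not Relic's actual code. The point is only that the world advances from the previous state plus the players' commands, so the commands are the only thing that needs to go over the network.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical player command: the only data sent to other players.
struct Command {
    std::int32_t player;  // index of the player issuing the command
    std::int32_t dx;      // requested movement, in integer world units
    std::int32_t dy;
};

// Hypothetical world state: one integer position per player.
struct World {
    std::vector<std::int32_t> x;
    std::vector<std::int32_t> y;

    explicit World(std::size_t players) : x(players, 0), y(players, 0) {}

    // Advance one tick. The result depends only on the previous state and
    // the commands, applied in the order given, so every client that
    // receives the same command list ends up with the same state.
    void step(const std::vector<Command>& commands) {
        for (const Command& c : commands) {
            const auto i = static_cast<std::size_t>(c.player);
            assert(c.player >= 0 && i < x.size());
            x[i] += c.dx;
            y[i] += c.dy;
        }
    }
};
```

Two clients that start from the same `World` and call `step` with the same commands every tick stay identical. Anything else that leaks into `step` (wall-clock time, local camera state, unordered iteration) is a potential source of divergence.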
You’ll want to know quickly and cheaply whether your simulation has diverged. We did this by computing a CRC of the state data each tick, so it could be compared against the CRCs from the other players. This also had the nice side effect of detecting when someone was trying to cheat by altering their local simulation.
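A rough sketch of the per-tick check might look like the following. The `Entity` layout, `checksum_state`, and `in_sync` are assumptions made up for this example; the CRC itself is the standard reflected CRC-32 (polynomial 0xEDB88320), written bit by bit for brevity rather than speed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Standard CRC-32, continuing from a previous value `crc` (start with 0).
std::uint32_t crc32_update(std::uint32_t crc, const void* data, std::size_t len) {
    assert(data != nullptr || len == 0);
    const auto* bytes = static_cast<const unsigned char*>(data);
    crc = ~crc;
    for (std::size_t i = 0; i < len; ++i) {
        crc ^= bytes[i];
        for (int bit = 0; bit < 8; ++bit) {
            // Shift right; XOR in the polynomial if the low bit was set.
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

// Hypothetical simulation entity; integer fields only.
struct Entity {
    std::int32_t id;
    std::int32_t x;
    std::int32_t y;
    std::int32_t health;
};

// Checksum the simulation state for one tick. Fields are copied into a
// plain array first so struct padding never reaches the CRC. Assumes all
// clients share the same byte order.
std::uint32_t checksum_state(const std::vector<Entity>& entities) {
    std::uint32_t crc = 0;
    for (const Entity& e : entities) {
        const std::int32_t fields[] = {e.id, e.x, e.y, e.health};
        crc = crc32_update(crc, fields, sizeof(fields));
    }
    return crc;
}

// True if every peer reported the same checksum as ours for this tick.
bool in_sync(std::uint32_t local_crc, const std::vector<std::uint32_t>& peer_crcs) {
    for (std::uint32_t peer : peer_crcs) {
        if (peer != local_crc) return false;
    }
    return true;
}
```

Each client would call `checksum_state` at the end of a tick, send the four-byte result alongside its commands, and raise a sync error as soon as `in_sync` returns false for any tick.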
So that tells you something went wrong. But how do you know which part exactly? By the time a divergence was detected, the actual issue had often started a few ticks earlier. To address this, debug builds kept a circular buffer of state data (more detailed than CRCs) for the most recent ticks. This buffer would get dumped to a log if a “sync error” (sim divergence) occurred. With the logs of several clients in the same match, it was possible to narrow down the root issue. The most common causes were: iterating through containers whose iteration order isn't guaranteed to match across machines (like std::unordered_map entries, or a std::map keyed by pointers), floating-point precision issues, and using non-simulation local-to-client data for simulation inputs (like where the player’s camera is looking).
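The circular buffer and the log comparison could be sketched roughly as below. `TickRecord`, `TickHistory`, and `first_divergent_tick` are hypothetical names for this example, and the `detail` string stands in for whatever richer state dump a real debug build would keep.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical per-tick record: more detail than a single checksum.
struct TickRecord {
    std::uint32_t tick;
    std::uint32_t crc;
    std::string detail;  // e.g. a text dump of entity positions and RNG state
};

// Fixed-size circular buffer holding the most recent `capacity` ticks.
class TickHistory {
public:
    explicit TickHistory(std::size_t capacity) : slots_(capacity) {
        assert(capacity > 0);
    }

    // Store a record, overwriting the oldest one once the buffer is full.
    void record(TickRecord r) {
        slots_[next_] = std::move(r);
        next_ = (next_ + 1) % slots_.size();
        if (count_ < slots_.size()) ++count_;
    }

    // Oldest-to-newest copy, suitable for writing to a log on a sync error.
    std::vector<TickRecord> dump() const {
        std::vector<TickRecord> out;
        out.reserve(count_);
        const std::size_t start = (next_ + slots_.size() - count_) % slots_.size();
        for (std::size_t i = 0; i < count_; ++i) {
            out.push_back(slots_[(start + i) % slots_.size()]);
        }
        return out;
    }

private:
    std::vector<TickRecord> slots_;
    std::size_t next_ = 0;
    std::size_t count_ = 0;
};

// Compare two clients' dumps (each sorted by ascending tick) and return the
// earliest tick present in both whose records differ, or -1 if every
// overlapping tick matches.
std::int64_t first_divergent_tick(const std::vector<TickRecord>& a,
                                  const std::vector<TickRecord>& b) {
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        if (a[i].tick < b[j].tick) {
            ++i;
        } else if (b[j].tick < a[i].tick) {
            ++j;
        } else {
            if (a[i].crc != b[j].crc || a[i].detail != b[j].detail) return a[i].tick;
            ++i;
            ++j;
        }
    }
    return -1;
}
```

Because the detailed records can differ before the coarse CRC check fires (or before anyone notices it), the first differing tick in the dumps is usually a better starting point than the tick where the sync error was reported.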