I’ve been running some A/B tests on the Distilled website recently. It was the first time I’d had my hands dirty in the data for a little while, and the tests weren’t doing what I expected. I found, for example, that winning variants would routinely underperform once we rolled them out live.

