I have already worked with AB tests, but only frequentist ones.
Recently, for an R&D project, I had to implement Bayesian AB tests. Since AB tests are key to shipping changes safely, I decided to share what I’ve learnt so far. I focused my research on the PyMC3 library.
It’s a very nice and brief introduction to Bayesian AB tests.
In Bayesian tests, you must set a prior distribution. It encodes your priors: what you already know before seeing the data.
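To make the idea of a prior concrete, here is a minimal sketch for a conversion-rate test. It assumes a Beta prior on the conversion rate, which has a closed-form update for yes/no outcomes, so no sampler is needed. This is a simplification for illustration and not the PyMC3 approach from the links; the function names and the numbers are hypothetical.

```python
def update_beta_prior(prior_alpha, prior_beta, conversions, visitors):
    """Combine a Beta(prior_alpha, prior_beta) prior with observed data.

    Each conversion adds 1 to alpha and each non-conversion adds 1 to beta.
    """
    if not 0 <= conversions <= visitors:
        raise ValueError("conversions must be between 0 and visitors")
    return prior_alpha + conversions, prior_beta + (visitors - conversions)


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical data: 12 conversions out of 100 visitors.
# Beta(1, 1) is a flat prior: we claim to know nothing beforehand.
weak = update_beta_prior(1, 1, conversions=12, visitors=100)

# Beta(10, 90) says we already believe the rate is around 10%.
informed = update_beta_prior(10, 90, conversions=12, visitors=100)

print("posterior with flat prior:", weak, "mean:", beta_mean(*weak))
print("posterior with informed prior:", informed, "mean:", beta_mean(*informed))
```

With the same data, the informed prior pulls the estimate towards 10%, while the flat prior lets the data speak almost alone. The more data you collect, the less the choice of prior matters.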
Here you can find an example of Bayesian AB tests with PyMC3. It also explains the advantages of Bayesian AB tests:
Bayesian A/B Testing employs Bayesian inference methods to give you ‘probability’ of how much B is better (or worse) than A.
The immediate advantage of this method is that we can understand the result intuitively even without understanding what p-value or null hypothesis means. This means that it’s easier to communicate with our business stakeholders in a language that makes sense — the language of risk and value.
Another advantage is that since Bayesian statistics don’t care for statistical significance, you don’t have to worry too much about the test size when you evaluate the result. You can start evaluating the effect from day one by reading the probability of B being better than A. Of course, as we get more data our answers will be more accurate, but since we are using the language of probabilities, we are able to say, for example, “A is better than B with 60% probability” rather than “We don’t have enough data”. So you can decide if you want to wait any longer.
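The "probability of B being better than A" described above can be sketched in a few lines. This is a rough illustration under stated assumptions: a flat Beta(1, 1) prior on each conversion rate, and plain Monte Carlo draws from the resulting Beta posteriors using Python's standard library instead of PyMC3. The function name and the visitor counts are made up for the example.

```python
import random


def prob_b_beats_a(conv_a, n_a, conv_b, n_b,
                   prior_alpha=1.0, prior_beta=1.0,
                   draws=20000, seed=0):
    """Estimate P(rate_B > rate_A) by sampling both Beta posteriors."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    wins = 0
    for _ in range(draws):
        # Posterior of each variant: prior counts plus observed counts.
        rate_a = rng.betavariate(prior_alpha + conv_a,
                                 prior_beta + n_a - conv_a)
        rate_b = rng.betavariate(prior_alpha + conv_b,
                                 prior_beta + n_b - conv_b)
        if rate_b > rate_a:
            wins += 1
    return wins / draws


# Hypothetical early reading: 50 visitors per variant.
early = prob_b_beats_a(conv_a=5, n_a=50, conv_b=7, n_b=50)

# Hypothetical later reading: 1000 visitors per variant, similar rates.
later = prob_b_beats_a(conv_a=100, n_a=1000, conv_b=140, n_b=1000)

print(f"early: P(B > A) is about {early:.2f}")
print(f"later: P(B > A) is about {later:.2f}")
```

The early reading already gives a usable probability, just a less decisive one. As the counts grow, the same observed difference yields a probability much closer to 1, which is the "wait or decide now" trade-off the quote describes.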
I had difficulty understanding how to interpret the results from the previous link. Then I found this other link, which helped me a lot.
I decided to combine the code from the two previous links into a small implementation. In this repository, you can find a Jupyter notebook and a Databricks notebook that implement an example of Bayesian AB tests.
That’s not enough to get something robust in production. For instance, this solution doesn’t handle big data. It is better suited to an R&D project, as mentioned earlier. But the idea can be very valuable for releasing to production safely. This is why I wanted to highlight AB tests and Bayesian AB tests.
I have recently self-published a book about Machine Learning in production. I compiled all the things I have learnt so far. The book is written in French.
Thank you for reading. Feel free to contact me on Twitter if you want to discuss this.