Multivariate testing (MVT) is a technique that modifies multiple variables at once to determine the highest-performing combination of elements on a single page. Compare this to traditional A/B testing, where you change only one variable at a time.
Let’s say you have a promotional banner on your website that says “Save 10% now!” in 36pt font. Perhaps you want to see if changing the font size or the copy on the banner would get more users to click on the promotion. With a straightforward A/B test, you might just change the font size or just change the copy. But what if there is some interaction effect, such that only a very large “Save $20 now!” message drives sales, while a small “Save $20 now!” or a large “Save 10% now!” doesn’t work as well? This is where multivariate testing comes in.
Multivariate testing will help you understand interaction effects between changing two completely different things.
In traditional A/B testing, you simply create one variant for each version of the element you want to change. If you want to test the color of a button, you create one variant for each button color. This means the number of variants grows linearly with the number of versions you test. You want to test 5 colors? You’ll have 5 variants.
In multivariate testing, this gets more complicated because you need to account for the combination of changes. Let’s say we want to test the color and the copy of a button. If we have 3 colors and 2 different lines of copy, we’ll end up with 6 variants:
Color 1 + Copy 1
Color 1 + Copy 2
Color 2 + Copy 1
Color 2 + Copy 2
Color 3 + Copy 1
Color 3 + Copy 2
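The enumeration above is just a Cartesian product, so you can generate the full variant list in a few lines of Python. The element names here are placeholders, not real test values:

```python
# Sketch: enumerating multivariate test variants as a Cartesian product.
# Element versions are illustrative placeholders, not data from a real test.
from itertools import product

colors = ["Color 1", "Color 2", "Color 3"]
copies = ["Copy 1", "Copy 2"]

# Every combination of one color and one copy line becomes a variant.
variants = [" + ".join(combo) for combo in product(colors, copies)]
for v in variants:
    print(v)

print(len(variants), "variants")  # 3 colors x 2 copies = 6 variants
```

Adding a third element (say, font size) is just another list passed to `product`, which is exactly why the variant count blows up so quickly.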
If you want to test 3 different elements (let’s say color, copy, and font size), it gets even more complicated:
Color 1 + Copy 1 + big font
Color 1 + Copy 1 + small font
Color 1 + Copy 2 + big font
Color 1 + Copy 2 + small font
Color 2 + Copy 1 + big font
Color 2 + Copy 1 + small font
Color 2 + Copy 2 + big font
Color 2 + Copy 2 + small font
Color 3 + Copy 1 + big font
Color 3 + Copy 1 + small font
Color 3 + Copy 2 + big font
Color 3 + Copy 2 + small font
The number of variants you need to have can be found with this formula:
[# of versions of element 1] x [# of versions of element 2] x [# of versions of element 3] x … = total # of variants
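That formula is a straight product of the per-element version counts, which you can sanity-check in code. The counts below are illustrative:

```python
# The formula above, as code: total variants is the product of the number
# of versions of each element. The counts here are illustrative examples.
from math import prod

versions_per_element = [3, 2, 2]  # e.g. 3 colors, 2 copy lines, 2 font sizes
total_variants = prod(versions_per_element)
print(total_variants)  # 3 x 2 x 2 = 12
```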
As you can see, adding another element to the mix dramatically increases the number of variants and divides your user traffic even further. This makes it harder and harder to reach statistical significance.
The most common reason product teams use multivariate tests rather than A/B tests is when they want to test the interaction effects between elements. If you simply want to test more than one element, you should run multiple A/B tests instead.
For example, if you want to test the color of a buy button and you also want to test the payment options available to the customer, those two things probably won’t affect each other. In this case you’re better off running two separate A/B tests, since each test splits your traffic into fewer groups and reaches statistical significance sooner.
As a simple example of this process in action, let’s take our testing case study of German app Clever Lotto, a popular lottery game in Europe. The Clever Lotto team wanted to increase the number of rounds their users played. In short, they wanted their users to stay online longer. With just the tap of a button, players could advance to the next drawing—but could a design change of the button lead to more people continuing to play?
The team ran a multivariate test on two font colors (red and green) and two calls to action (the gentle “Play again now” and the more intense “PLAY AGAIN!”). The hypothesis: red letters would be the game changer. The results? Color made little difference; the intensity of the wording won out: PLAY AGAIN! We found that green had a 72.40% chance of causing a 5.95% lift over red, while “PLAY AGAIN!” had a 99.85% chance of causing a 25.38% lift over “Play again now.” This meant that the color change did not have a statistically significant impact, while the copy change absolutely did.
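A “chance of causing a lift” figure like the ones quoted above is typically a Bayesian comparison of conversion rates. Here is a minimal Monte Carlo sketch of that idea, drawing from Beta posteriors over each variant’s rate; all of the counts below are made up for illustration and are not Clever Lotto’s actual data:

```python
# Sketch: estimating P(variant A's true rate > variant B's true rate)
# by sampling from Beta(1 + conversions, 1 + non-conversions) posteriors.
# All counts below are hypothetical, not from the Clever Lotto test.
import random

def chance_of_lift(conv_a, n_a, conv_b, n_b, draws=100_000, seed=42):
    """Monte Carlo estimate of P(rate_a > rate_b) under uniform Beta priors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        if rate_a > rate_b:
            wins += 1
    return wins / draws

# Hypothetical: "PLAY AGAIN!" converted 627/2000 visitors vs 500/2000
# for "Play again now."
print(chance_of_lift(627, 2000, 500, 2000))
```

A clearly separated pair like this returns a probability near 1, while two near-identical variants hover around 0.5, which is why the color comparison above was inconclusive.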
Since multivariate tests split users into many more groups than single-variable A/B testing, you’ll need a large volume of users to even justify the process. Sample sizes and run times have a significant impact on results from any kind of testing. The larger the sample size taken from your user pool, the more accurate the results will be. If your changes are subtle and do not yield a big difference in user behavior, you will need even more traffic to reach statistically significant results.
Multivariate testing requires detailed preparation to get the most meaningful results possible. As stated above, do not use multivariate testing if there is no logical reason why two or more elements should affect each other.
It can also be a good idea to run a classic A/A test to confirm that the software and your product are testing accurately. If all goes well, an A/A test will come back inconclusive—because both variations are the same—and you can move on to more interesting experiments.
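You can rehearse that A/A check in code: simulate two “variants” with the same true conversion rate and confirm that a standard two-proportion z-test (sketched below) comes back inconclusive. The 5% rate and sample sizes are illustrative assumptions:

```python
# Sketch of an A/A sanity check: two groups drawn from the SAME true
# conversion rate should usually produce a large, inconclusive p-value.
# The 5% rate and n=5000 per group are illustrative assumptions.
import random
from statistics import NormalDist

def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for the difference between two conversion rates."""
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = (p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b)) ** 0.5
    z = abs(conv_a / n_a - conv_b / n_b) / se
    return 2 * (1 - NormalDist().cdf(z))

rng = random.Random(0)
n = 5000
# Both "variants" share the same true 5% conversion rate.
a = sum(rng.random() < 0.05 for _ in range(n))
b = sum(rng.random() < 0.05 for _ in range(n))
print(f"p-value = {two_proportion_p_value(a, n, b, n):.3f}")
```

Because the groups are identical by construction, a small p-value here would point to a problem with the test setup rather than a real difference (keeping in mind that roughly 5% of honest A/A runs still dip below 0.05 by chance).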
Success is often unique to each team, product, and test. However, it’s imperative that you have a hypothesis of how the multivariate test will affect the metric that you’re measuring before you begin. Multivariate testing should not be used just to see what happens.
Mostly, we see teams test user acquisition, conversion rates, retention/engagement, and monetization.
Another factor that influences the results of multivariate testing is how long the tests run, along with the size of the sample being analyzed. There are dependable formulas that give relatively precise estimates of the sample size needed, which in turn determines how long the test must run. Following those estimates is the best guideline for getting results as precise as possible from multivariate testing.
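One such dependable formula is the standard two-proportion sample-size calculation. The sketch below shows it in pure Python; the baseline rate, target lift, and variant count are illustrative assumptions, not recommendations:

```python
# Sketch: classic two-proportion sample-size formula,
#   n = (z_alpha/2 + z_beta)^2 * (p1(1-p1) + p2(1-p2)) / (p1 - p2)^2,
# per variant, for a two-sided z-test. Inputs below are illustrative.
from math import ceil
from statistics import NormalDist

def sample_size_per_variant(p1, p2, alpha=0.05, power=0.8):
    """Visitors needed per variant to detect a shift from p1 to p2."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_beta = z.inv_cdf(power)           # desired statistical power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

# Illustrative: detecting a lift from 5% to 6% conversion, then multiplying
# by a 12-variant multivariate layout to see the total traffic required.
n = sample_size_per_variant(0.05, 0.06)
print(n, "visitors per variant;", n * 12, "total for a 12-variant test")
```

Multiplying the per-variant figure by the variant count makes the earlier point concrete: every extra element multiplies the traffic a multivariate test needs.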