Tuesday, June 15, 2021

A/B testing considered BS

In our launch post we sensationally stated that A/B testing is “bullshit”. We mean that: it's a waste of time and it promotes bad design. It's essentially a system for marketers to look smart.

We claim that:

  1. A/B testing wastes your time
  2. A/B testing blocks development and growth
  3. A/B testing promotes bad design

Wasted time

Our first claim is that A/B testing wastes your time and money. There are two distinct reasons:

1. Absurdly long waiting times

Let's take an example website with a decent amount of traffic (10,000 monthly visitors) and a conversion rate of 2% (new signups). Then we set up an A/B test to see whether an update on the front page (like a new background color) would make a positive impact on the conversion rate.

To get statistically significant results we want to collect at least 400 conversions per variation before drawing any conclusions, which seems to be a recommended ballpark number to aim for *.

A quick calculation, 2 * 400 / (10000 * 0.02) = 4, reveals that the test must run for four months (about 120 days) before you can confidently draw any conclusions.
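The same arithmetic as a small Python sketch, in case you want to plug in your own numbers. The function name and its parameters are made up for illustration, the 400 conversions per variation is the ballpark figure from above, and a month is assumed to be 30 days.

```python
def months_to_finish(monthly_visitors, conversion_rate,
                     conversions_per_variation=400, variations=2):
    """Rough number of months an A/B test must run (illustrative only)."""
    # Conversions the whole site produces per month, across all variations.
    monthly_conversions = monthly_visitors * conversion_rate
    # Every variation needs its own share of conversions.
    return variations * conversions_per_variation / monthly_conversions


months = months_to_finish(10_000, 0.02)
days = months * 30  # assumption: a 30-day month
print(f"{months:.0f} months, about {days:.0f} days")
```

Add a third variation or halve the traffic and the wait only gets longer.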

Four months is a long time.

2. Micro-experiments only

The WYSIWYG optimization tools are only designed to test small things like alternate colors, titles, or images. You cannot experiment with the stuff that matters like global design updates, brand renewals, product launches, or any major content updates spanning multiple pages.

Here's how A/B testing works:

Setting up a button test in Google Optimize

Some element on a page is given an alternate design and half of your visitors are exposed to it. The other half is served with the original button. Then the results are compared:

A typical decision-making process with A/B testing
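For the curious, the 50/50 split described above can be sketched in a few lines of Python. This is not how Google Optimize or any other tool implements it; the function name, the experiment key, and the visitor ids are hypothetical. The idea is only that hashing a visitor id gives each visitor a stable bucket, so roughly half of them see the alternate button.

```python
import hashlib


def assign_variant(visitor_id, experiment="button-test"):
    """Deterministically put a visitor into variant "A" or "B" (sketch)."""
    key = f"{experiment}:{visitor_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the hash as an integer; even -> A, odd -> B.
    bucket = int.from_bytes(digest[:8], "big") % 2
    return "B" if bucket else "A"


# Simulate 10,000 visitors and count how many land in each variant.
counts = {"A": 0, "B": 0}
for i in range(10_000):
    counts[assign_variant(f"visitor-{i}")] += 1
print(counts)
```

The same visitor always gets the same variant, and the two groups come out close to even.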

Things seem to look great right after the test is started. The new button seems to win by a mile! Slowly, but steadily, the curves start to settle and you'll begin to see the winner. At some point (too early), you make the decision.

If you had the patience to wait for four months, you would see the obvious: small changes make little or no impact. It was all just a waste of time. Not just days, but months.

Of course, nobody waits that long, so the test was all 💩.

Blocked development

With A/B testing you are not supposed to make any changes on the website while a test is running; otherwise the result is distorted. You must be sure that the outcome is fully attributed to the test variant and that no other variables are in play.

If your company uses an iterative A/B testing cycle, where a new test is always started after the last one finishes (aka “always be testing”), you end up in a situation where the developers must stop working so that the A/B tests stay valid.

Micro-edits are preferred over the more important stuff like performance optimizations and global design updates.

Luckily, things are not this bad in the real world: people don't care about statistical significance, so they continue developing the site normally, which makes the test results 100% random. At least doing it wrong allows you to move faster.

Treating A/B tests as bullshit is a better way forward.

Bad design

When the A/B testing software is dictating the design you can only work on things that the testing software is capable of: alternate colors, headlines, and images.

The most popular experiment is indeed the button test. Take your primary call-to-action button on the front page and spice it up: use a shiny background color (typically orange) and make it bigger.

And when the test is over, start a new experiment. Maybe tweak the buttons on the pricing page? Because you should always be experimenting.

Eventually, this leads to a terrible overall design.

Remember Dieter Rams and the Principles of good design? Good design is innovative, understandable, honest, unobtrusive, long-lasting, and thorough. That's contrary to what A/B testing is all about.

Couple that with the fact that Google Tag Manager lets marketers insert all sorts of optimization tools on your site, making your pages fatter and slower. Optimizely alone is a whopping 800 KB of JavaScript. Developers are fully aware of this “website obesity crisis”, where page weight reliably increases every year.

And with these optimization scripts in place, your visitors begin to see annoying “FOOC effects” (flash of original content), where the original headline is briefly shown before the B alternative replaces it. Visitors feel they are just a sample to be processed.

A/B testing is an effective way to ruin the user experience with bad design choices. Designers and developers have less motivation to work in this kind of environment where ambition is replaced with cheap sales tactics.

Conclusion

Traditional experimentation and A/B testing are harmful. They waste your time and promote bad design, ultimately lowering your conversion rate.

We recommend the following instead:

  1. Kindly ask marketers to stop using a visual A/B testing tool
  2. Let them focus on user acquisition, content, and marketing
  3. Let designers focus on usability and design
  4. Let developers focus on performance and development

Maybe use something more clever for A/B testing.
