Problems in traditional A/B testing

The first-generation testing tools like Optimizely and VWO are suitable for micro-optimizing high-traffic sites, but not ideal for the bigger picture.

Guesswork

Traditional A/B testing starts with an off-the-cuff hypothesis: “By using a bigger call-to-action button here, we should get more leads.” However, the style or the copy of your buttons is hardly your biggest concern. While you run your experiment, your site is probably leaking tons of people who could be captured with a simple fix. You should really be fixing your bottlenecks: that’s where you make the biggest wins with the least effort.

Many businesses would be better off if they didn’t run A/B tests at all.

David Kadavy

Author of “Design for Hackers”

Micro-optimizations

Traditional A/B testing is relatively trivial: you are encouraged to make small changes on a single page and see how they affect one metric over the current time range. However, there is so much more you can do for your growth by optimizing the bigger picture: your entire website and its market.

Results take too long

In traditional A/B testing, the results take too long to develop.

It’s not unusual to wait for several months before you can trust the results. A rule of thumb is to collect 250-400 conversions per variation before the results are statistically significant.

For example, let’s assume you have a reasonably popular site with 10,000 monthly visitors, and your lead conversion rate is 3%. That generates a charming 10 leads every day (1). Then we start the above A/B test with a more prominent call-to-action button: A is the original button, B is the bigger one.

With 300 conversions per variant, you must wait for 60 days before you can be sure which button was better during the experiment. (2)

(1) 10,000 / 30 × 0.03 = 10

(2) 2 × 300 / 10 = 60
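The arithmetic in footnotes (1) and (2) can be sketched as a small calculation. This is only an illustration under the same assumptions as the example (a 30-day month, traffic split over two variants, a fixed conversion target per variant); the function names are made up for this sketch and do not come from any testing tool.

```python
from fractions import Fraction
import math


def daily_conversions(monthly_visitors, conversion_rate, days_per_month=30):
    """Average conversions per day, as in footnote (1)."""
    # Fraction(str(...)) keeps a rate like 0.03 exact and avoids float rounding.
    rate = Fraction(str(conversion_rate))
    return Fraction(monthly_visitors, days_per_month) * rate


def days_until_done(monthly_visitors, conversion_rate,
                    conversions_per_variant, variants=2):
    """Days needed to reach the conversion target on every variant, as in footnote (2)."""
    per_day = daily_conversions(monthly_visitors, conversion_rate)
    return math.ceil(variants * conversions_per_variant / per_day)


print(daily_conversions(10_000, 0.03))     # 10
print(days_until_done(10_000, 0.03, 300))  # 60
```

With the same traffic, the 250-400 rule of thumb works out to a wait of 50 to 80 days.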

Blocked progress

Perhaps the biggest problem in traditional A/B testing is that you are not expected to make any updates on the site while the experiment is running. You must make sure that the button is making the impact and not anything else. That is: you must stop working on the site while waiting for the results. This can take a couple of months.

And every time you run a test, you are likely leaking tons of visitors somewhere at the top of your funnel. You could have fixed those leaks and made a bunch of other optimizations while waiting for the button experiment to end.

That is not acceptable for a startup because you want to be continually improving and “pivoting.” You want to make big content and functional changes to your website, and you want to find new and better audiences that generate more traction.

Wasted data

When a traditional A/B test is over, the experimentation data is no longer usable. You only know which variant performed better while the experiment was running.

Instead of just learning one thing, you could have collected new insights about your landing pages, referring domains and visitors, and then reused this information in the next experiment.

Unfortunately, your next test starts again with a “heuristic analysis” or based on some external data.

User experience damage

Traditional A/B testing software causes the following distractions for your visitors just by being included on your pages:

Slower page load time. For example, Optimizely weighs a whopping 796K, which is literally 200 times bigger than Volument. Despite all the superintelligence included in the JavaScript file, Optimizely slows down your pages, and your visitors get a less convincing first impression.
Flash of original content. Visual A/B testing tools can cause an off-putting FOOC effect, where the original content is briefly displayed before the alternate (B variant) kicks in. This makes for a flaky experience, and savvy people realize they are just part of a marketing experiment.

The UX damage caused by the various analytics tools

A bad first impression makes people want to leave the site before they even bother to find out what the product is.