Why You Don't A/B Test... And How You Can Start This August

Hiya guys!

Patrick (patio11) here. You're getting this email from me because you asked for my occasional thoughts on making and selling software.

[Edit: Actually, it's possible that you've never gotten an email from me. Somebody might have just given you the link to this page, which is an online archive of an email that I sent to folks who had asked for it. If you'd like to get articles like this in your inbox, totally free, about once a week or two, give me your email address.]

The single highest ROI tactic which I know of for software companies is conversion optimization. Unfortunately, the overwhelming majority of you aren't doing it.

Brief administrivia: I mentioned in an earlier mail that I'd be writing a shedload this month about conversion optimization. You guys split about evenly between "Yay!" and "I'd appreciate more of a mix of topics." These emails will be on the normal eclectic mix of topics -- today A/B testing, something totally different next week, etc. If you want to get, *in addition* to the usual Friday email, about two mails a week this month about practical conversion optimization for software companies, click here and I'll send them to you separately. No obligation, totally free, cancel any time. More details at the bottom of this email if you need them.

I don't say "You're probably not A/B testing" with malice. With the exception of vanishingly few firms operating at high levels of sophistication (Google, Facebook, Zynga, etc.), the software industry mostly pays lip service to A/B testing. No, really. I know you think that everyone does A/B testing because you've read about it on the Internet for years. I thought that, too. Would it surprise you that Bingo Card Creator has run more tests than most B2B software firms with $50 million a year in revenue? No, really.

I used to consult at companies with $10 to $50 million a year in revenue. The median total number of A/B tests run in the firms' history was zero. I talk to probably a thousand software companies a year, online, on the conference circuit, and for work, and it's true up and down the industry. I'll even tell you a secret: even among firms which have a well-earned reputation for testing, in any given week, it is more likely than not they are not actually running an A/B test.

Your Excuses For Why You Aren't Testing

I've heard every possible genre of excuse for why software companies are not A/B testing. Most are baseless. Do any of the following sound familiar?

"We Don't Have Enough Traffic"

If you're pre-product or a very, very young firm, lack of traffic is actually a valid reason for not A/B testing. If that's you, A/B testing (and the scads of writing I'm doing about it in August) is mostly a distraction from solving your more pressing problem that you have no product or no one knows who you are. Work on that instead.

However: most software product companies which are big enough to cover at least one person's salary can A/B test meaningful changes.

Here, let me teach you three rules of thumb:

Testing free trial signup or email submission? You need 3,000 visits to detect a small change in your conversion rate.
Testing a usability improvement within the app? You need about 1,500 users to detect improvements within your core workflow (good news: you'll notice almost any level of improvement).
Testing conversion to paid use from a free trial? You need about 2,000 free trials to detect an epic win in conversion rate, or 5,000 free trials to detect a strong win. (The bigger the impact, the easier it is to detect.) Drop a digit off of both these numbers if you require a credit card upfront.

The pace at which you hit those numbers determines the pace you can test at. For example, if you've got 3,000 visits a month and a free trial, you can test improvements to the free trial signup roughly monthly. If you hit that number in a week, then weekly.
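If you'd like that arithmetic in a form you can poke at, here is a rough sketch in Python. The thresholds are just the rules of thumb above; the function name, the dictionary keys, and the example traffic figures are made up for illustration.

```python
import math

# Rule-of-thumb sample sizes quoted in this email (not derived here).
RULES_OF_THUMB = {
    "trial_signup": 3000,      # visits, to detect a small change
    "in_app_usability": 1500,  # users, core workflow improvement
    "paid_epic_win": 2000,     # free trials, epic win in conversion
    "paid_strong_win": 5000,   # free trials, strong win in conversion
}


def days_per_test(required_sample, daily_volume):
    """Return how many whole days it takes to collect one test's sample."""
    if daily_volume <= 0:
        raise ValueError("daily_volume must be positive")
    return math.ceil(required_sample / daily_volume)


# Hypothetical site with 100 visits a day testing its trial signup.
print(days_per_test(RULES_OF_THUMB["trial_signup"], 100))
```

Plug in your own daily visits, users, or trials to see which kind of test you can afford to run first.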

Don't know if you have those numbers? You probably do: Bingo Card Creator, one of the world's nichiest hobby projects, had sufficient volumes to do A/B tests every week even six years ago. That's where I started.

Curious as to where I pulled these magic numbers from? Well, let's start into a brief detour of the mechanics of 2-tailed z-tests and... wait, what the heck am I saying. Every time I have that conversation people's eyes roll into the back of their head before I get to "... and that's how you can be certain this will make you a million dollars."

I'll be happy to do a deep dive into the math for the stats geeks in the audience later this August, or you can read Ben Tilly's presentation on A/B testing math. That presentation is worth an undergraduate degree in Statistics. (Note: I do not suggest it for light reading. I'm dead serious: if that presentation doesn't take you days of challenging work to get through you're doing it wrong. About once a year I block off a day to go through a few chapters of it and end that day smarter... and mentally exhausted.)
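For the impatient, here is a minimal sketch of a 2-tailed z-test comparing two conversion rates, in Python with only the standard library. It assumes the textbook pooled two-proportion test, which may not be exactly how the rules of thumb above were derived, and the function name and sample numbers are hypothetical.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-tailed p-value) for the difference in conversion rates.

    conv_a / conv_b are conversion counts; n_a / n_b are participant counts.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("no variation in the data; cannot compute z")
    z = (rate_b - rate_a) / std_err
    # Two-tailed p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up example: 150 of 1,500 convert on A, 195 of 1,500 on B.
z, p = two_proportion_z_test(150, 1500, 195, 1500)
print(f"z = {z:.2f}, p = {p:.4f}")
```

The smaller the p-value, the less likely the difference is just noise; 0.05 is the conventional cutoff, though it is a convention rather than a law of nature.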

But back to your takeaway: if you haven't started A/B testing because you think you don't have the traffic, start farther up the funnel at your more common conversions, such as free trial signups or signups to your mailing list, and punt on testing in-app behavior/sales pages/etc until later.

"That 'Math' Nonsense Isn't A Substitute For Vision"

I hope nobody at your company says this, but maybe you've heard someone say it, so let me put your mind at ease.

An anecdote from a fruit company, about A/B testing: "A/B testing? We don't do A/B testing. We do S/J testing. If S/J agrees with it, it passes the test."

You are not S/J. (S/J didn't always make great decisions, either, but the great thing about convincing the entire world to buy your hardware is that if you can pull that off you can pay for lots of bad decisions.)

A lot of artists -- and software companies are often founded and staffed by people who have artistic inclinations in their heart of hearts -- feel threatened by A/B testing, because they have the perception that it replaces that vision thing with boring, automatic, mechanical systemization. They react to this by saying "Bah, you can't expect A/B testing to make really consequential decisions" or "Well sure, you could A/B test, but that will only find a local maximum. Why get better at climbing hills when the true measure of an entrepreneur is making sure they're on the right mountain?"

These rationalizations are psychological defense mechanisms against a fear which is, like many, absolutely groundless. Conversion optimization doesn't take the art out of building and selling software any more than calculators take the keen architectural insight out of building bridges -- both just systematically prevent lethal accidents.

I once worked with a 10 year old company: dozens of employees, several great products, millions in revenue. They had the same problems that every other great company has. One time, they grew a bit too fast, the cash flow got tight, and the CEO worried that he was going to have to lay off people. They fixed the problem by working themselves to the bone and pulling through by sheer grit. It's the stuff that entrepreneurial mythmaking is made of.

Except... they didn't have to do that. Why not? Their only problem was that their software -- which took visionary product chops and strong engineering talent to build -- wasn't selling enough. One reason why it wasn't selling enough was one page on the website had a particular decision made back in 2004. That decision was based on a quick guess. Nobody remembers who made the guess, and it doesn't really matter. The point is that it was made, the company went back to doing the things it does best, and the decision was not revisited for almost a decade.

The guess was wrong. Catastrophically wrong.

I could tell you what it was but, well, client confidentiality. I'll say this much: you'd be flabbergasted if I told you what it was. I was flabbergasted and I do conversion optimization for a living. It contravenes our intuitions about the universe being a just place to have such a great thing nearly brought low by such a trivial problem.

Anyhow, the difference between getting that element of the site right and getting it less-than-right worked out to millions of dollars a year. The CEO never said "Well, I could spend two hours working on one web page, but that sounds difficult, so instead I think I'll fire some of my closest friends when we fail to hit payroll in 2 weeks." The accountant never pulled the fire alarm because standard accounting doesn't really have a word for "Losses incurred by inability to choose the most optimal course of action." It was just an accident.

It was a preventable accident. The element at issue would have been -- scratch that, was -- one of the first things any A/B testing practitioner would start with. The test which made millions took less time to write than this email. Too bad it wasn't done in 2004.

"But Patrick, something that small couldn't possibly have mattered that much." I used to listen to this complaint, but honestly, if you're not convinced yet that math is a very accurate way to measure which of two numbers is bigger than the other, we've got very little to talk about.

Best Buy made $300 million in annual revenue by removing a single button and adding two sentences of copy.
37signals, among many other software companies, have shared many A/B tests publicly where the quoted improvements must quickly hit the millions range. (My personal favorite test from them.)
Bingo Card Creator did a 60% yearly revenue increase in 2012 on the strength of a brief series of A/B tests. That was the only thing I changed about BCC in 2012.

At this point, "A/B testing works" is like "Smoking causes lung cancer." The evidence for the conclusion is absolutely overwhelming. I know artists. I love artists. Don't start smoking to be more artistic -- and don't avoid A/B testing because it would threaten your artistic identity.

"We Don't Have Enough Time"

Horsepuckey. Graphical tools like Visual Website Optimizer or Optimizely make many genres of A/B tests take 5 to 10 minutes to set up. I've sat in many meetings where extraordinarily consequential decisions, such as the main headline on the home page, were debated for hours when they could have been tested in minutes.

You don't need to buy one of the graphical tools, by the way. In the dark ages of 2009, there were pretty much no good OSS A/B testing frameworks. Most companies doing it either used one of a few Big Freaking Enterprise packages ("Talk to our sales team for pricing, but I hope you have $500,000"), rolled proprietary tools internally, or used Google Website Optimizer, which was (at the time) a terrible product. (Long story, ask me some other time.)

I made it my personal mission to change that, so I created an OSS A/B testing library for Ruby on Rails (A/Bingo). This knocked A/B testing for me down from "a considerable pain in the butt" to "as easy as adding two lines of code."
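To give a feel for why a test can be that small, here is a sketch of the core idea in Python. To be clear, this is not A/Bingo's API or its implementation; it is a hypothetical illustration that assumes each visitor has a stable ID, hashes it to pick an alternative, and remembers who converted. All names are invented.

```python
import hashlib
from collections import defaultdict

# In-memory tallies keyed by (test_name, alternative); a real tool would persist these.
participants = defaultdict(set)
conversions = defaultdict(set)


def pick_alternative(test_name, user_id, alternatives):
    """Deterministically map a user to one alternative for a given test."""
    key = f"{test_name}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return alternatives[int(digest, 16) % len(alternatives)]


def ab_test(test_name, user_id, alternatives):
    """Enroll the user in the test and return the alternative to show them."""
    choice = pick_alternative(test_name, user_id, alternatives)
    participants[(test_name, choice)].add(user_id)
    return choice


def record_conversion(test_name, user_id, alternatives):
    """Credit a conversion to whichever alternative the user was shown."""
    choice = pick_alternative(test_name, user_id, alternatives)
    if user_id in participants[(test_name, choice)]:
        conversions[(test_name, choice)].add(user_id)


# The "two lines" at the call site: one to choose, one to score.
headline = ab_test("home_headline", "user-42", ["Save time", "Save money"])
record_conversion("home_headline", "user-42", ["Save time", "Save money"])
```

Hashing the ID means a returning visitor always sees the same alternative without you having to store the assignment anywhere.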

In a triumph of the OSS model, people have made over a dozen frameworks, covering almost every commonly used web programming stack, directly inspired by A/Bingo or its design. (My proudest professional accomplishment is inspiring the design of GAE/Bingo, developed at Khan Academy. Their use of it for empirical tests of pedagogical/UI effectiveness has taught more things to more students than I could have if I had worked as a teacher for a hundred lifetimes. And all for free, too.)

Now, in 2013, great options exist for both your engineering/product teams and your marketing team to do A/B testing. They take mere minutes to set up and plug straight into your normal workflows for creating your site/app. You're just not using them because...

"We Don't Know What To Test"

And now we come to the heart of the matter. There are thousands of blog posts on the Internet which tell you that your call to action buttons should be ~~red~~ orange, that "Create Trial" outperforms "Submit", that your credit card form should have a green lock icon, that...

It's a cacophony of opinions, a pandemonium of test results, and we suffer from analysis paralysis and default to doing what we know how to do, which is "nothing." (Why test when you're in doubt when you can do nothing very confidently?)

I have been advocating for A/B tests for years, but advocating and toolwriting isn't enough, so I've blocked off August to work on systematically killing the reasons the software industry doesn't test.

I'm implementing long-deferred tests for Appointment Reminder, the business which is my priority since quitting consulting. I'm working with five other software firms on their testing, in the open. (I'm Internet buddies with a lot of software firms, so I got in touch with a few which were big enough to be interesting but small enough to not have Legal go spare at the notion of writing about their conversion rates.) And I'm going to be writing a truckload, because it is a more productive use of my time than playing League of Legends while waiting to get results back.

When I asked about your guys' interest level in A/B testing a bit ago, many of you were as interested in it as I am, but some of you were more lukewarm. I want to avoid wasting anyone's time, so if you burn with desire to finally start A/B testing, you should:

Click this link to get a mini-course on it.
Give me your email address.
Click the confirm link I'll mail to you.

The mini-course is about two mails a week continuing through about the end of August, with deep dives into A/B testing. (This is in addition to the normal Friday emails, which will cover the usual eclectic mix of topics.)

They will include:

Concrete suggestions on what to test for the home page, pricing page, landing page, and in-app experiences
How to identify what your company's particular weaknesses with conversions are and start addressing them
How to actually use A/B testing in real companies -- you know, ones which have busy employees, uncertain priorities, politics, legacy systems, and all the other things that never seem to happen in writeups on the Internet
Case studies from my company and the firms I'm working with: what was tested, why that was tested, why that alternative out of the universe of possible ones, and (where possible) how it worked out

Is there a catch? There has to be a catch.

The mini-course is totally free, cancel at any time, no obligation, yadda yadda yadda.

Why? Googlers get 20% time to tinker on new projects, I get 20% time to teach. (It's a weird hobby, but hey, I used to play World of Warcraft, so it is an improvement.) I decided to try going for depth over breadth for a change.

Is there another announcement?

Yep! In addition to working on A/B tests and writing about them, I'm also shooting a video course on it. About 200 of you took my course last year on lifecycle email marketing. This course will be similar in character, on A/B testing and conversion optimization for software companies.

I'm tentatively planning on launching it late in August. Video makes Waterfall software development look like bugs-in-your-teeth speed, though, so it is still too early to promise anything. It will be ready when it is ready. I'll keep you guys occasionally informed.

As always, I appreciate your comments. You can hit reply -- I read almost everything and respond to most of it.

Until next time.

Regards,

Patrick McKenzie

P.S. If for some reason you managed to read this far and want to get the free mini-course on A/B testing but didn't click on the link yet, let me save you the scroll back up: here you go.