The Complete Guide to A/B Testing: Expert Tips from Google, HubSpot and More

10 Apr 2020
27 minute read

This probably isn’t the first time you’ve read about A/B testing. You might even already A/B test your email subject lines or your social media posts.

Despite the fact that there’s been plenty said about A/B testing in the field of marketing, a lot of people still get it wrong. The result? People making major business decisions based on inaccurate results from an improper test.

A/B testing is often oversimplified, especially in content written for store owners. Below you’ll find everything you need to know to get started with different types of A/B testing for ecommerce, explained as plainly as possible.

Table of Contents

What is A/B testing?
How A/B testing works
What is A/B/n testing?
How long should A/B tests run?
Why should you A/B test?
What should you A/B test?
Prioritizing A/B test ideas
A crash course in A/B testing statistics
How to set up an A/B test
How to analyze A/B test results
How to archive past A/B tests
A/B testing processes of the pros
Optimize A/B testing for your business

What is A/B testing?

A/B testing, sometimes referred to as split testing, is the process of comparing two versions of the same webpage, email, or other digital asset to determine which one performs better.

This process allows you to answer important business questions, helps you generate more revenue from the traffic you already have, and sets the foundation for a data-informed marketing strategy.

How A/B testing works

When using A/B testing in the context of marketing, you show 50% of visitors version A of your asset (let’s call this the “control”), and 50% of visitors version B (let’s call this the “variant”).

The version that results in the highest conversion rate wins. For example, let’s say the variant (version B) yielded the highest conversion rate. You would then declare it the winner and push 100% of visitors to the variant.

Then, the variant becomes the new control, and you must design a new variant.
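To make the 50/50 split concrete, here is a minimal sketch of one common way to assign visitors. It assumes each visitor has a stable identifier (for example, a cookie value); the function name, the experiment name, and the choice of hashing are all illustrative assumptions, not how any particular testing tool works.

```python
import hashlib


def assign_variation(visitor_id: str, experiment: str,
                     variations=("control", "variant")) -> str:
    """Deterministically map a visitor to one variation with equal odds."""
    # Hashing the experiment name together with the visitor ID means the
    # same visitor always sees the same version within one experiment.
    key = f"{experiment}:{visitor_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    bucket = int(digest, 16) % len(variations)
    return variations[bucket]


# Hypothetical traffic: 10,000 visitors entering a headline test.
counts = {"control": 0, "variant": 0}
for i in range(10_000):
    counts[assign_variation(f"visitor-{i}", "homepage-headline")] += 1

print(counts)  # roughly half of the visitors land in each group
```

Because the assignment depends only on the identifier, a returning visitor keeps seeing the same version, as long as the identifier survives between visits.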

It’s worth mentioning that an A/B test conversion rate is an imperfect measure of success. Why? You can increase your conversion rate instantly by making everything in your store free. Of course, that’s a terrible business decision.

That’s why you should track the value of a conversion all the way through to the sound of a ringing cash register.
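As a rough illustration of why conversion rate alone can mislead, the sketch below compares two versions on both conversion rate and revenue per visitor. All numbers are made up for the example, and `summarize` is a hypothetical helper.

```python
def summarize(visitors: int, orders: int, revenue: float) -> dict:
    """Return conversion rate and revenue per visitor for one version."""
    return {
        "conversion_rate": orders / visitors,
        "revenue_per_visitor": revenue / visitors,
    }


# Hypothetical results: the variant converts more visitors, but at
# heavily discounted prices, so each visitor is worth less.
control = summarize(visitors=1_000, orders=30, revenue=1_500.0)
variant = summarize(visitors=1_000, orders=45, revenue=1_350.0)

print(control)
print(variant)
```

In this made-up case the variant "wins" on conversion rate and loses on revenue per visitor, which is the number that reaches the cash register.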

What’s A/B/n testing?

With A/B/n testing, you can test more than one variant against the control. So, instead of showing 50% of visitors the control and 50% of visitors the variant, you might show 25% of visitors the control, 25% the first variant, 25% the second variant, and 25% the third variant.

Note: This is different from multivariate testing, which also involves multiple variants. When running multivariate tests, you’re not only testing multiple variants, you’re testing multiple elements as well, such as A/B testing UX or SEO split testing. The goal is to figure out which combination performs best.

You’ll need a lot of traffic to run multivariate tests, so you can ignore those for now.

How long should A/B tests run?

Run your A/B test for at least one, ideally two, full business cycles. Don’t stop your test just because you’ve reached significance. You’ll also need to meet your predetermined sample size. Finally, don’t forget to run all tests in full-week increments.

Why two full business cycles? For starters:

You can account for “I need to think about it” buyers.
You can account for all of the different traffic sources (Facebook, email newsletter, organic search, etc.)
You can account for anomalies. For example, your Friday email newsletter.

If you’ve used any sort of A/B or landing page testing tool, you’re likely familiar with the little green “Statistically Significant” icon.

For many, unfortunately, that’s the universal sign for “the test is cooked, call it.” As you’ll learn more about below, just because A/B test statistical significance has been reached does not mean you should stop the test.

And your predetermined sample size? It’s not as intimidating as it seems. Open up a sample size calculator, like this one from Evan Miller.

This calculation says that if your current conversion rate is 5% and you want to be able to detect a 15% effect, you need a sample of 13,533 per variation. So, in total, over 25,000 visitors are needed if it’s a standard A/B test.

Watch what happens if you want to detect a smaller effect:

All that’s changed is the minimum detectable effect (MDE). It’s decreased from 15% to 8%. In this case, you need a sample of 47,127 per variation. So, in total, nearly 100,000 visitors are needed if it’s a standard A/B test.
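If you’d rather see the arithmetic than trust a black box, here is a minimal sketch of a sample size calculation. It assumes a two-sided test at a 5% significance level with 80% power, a relative minimum detectable effect, and a normal-approximation formula that happens to reproduce both figures quoted above. Whether any given calculator uses exactly this formula is an assumption, and the function and parameter names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def sample_size_per_variation(baseline: float, relative_mde: float,
                              alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate visitors needed per variation for a two-sided test."""
    delta = baseline * relative_mde          # absolute difference to detect
    p2 = baseline + delta                    # conversion rate if the effect is real
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    sd_null = sqrt(2 * baseline * (1 - baseline))
    sd_alt = sqrt(baseline * (1 - baseline) + p2 * (1 - p2))
    return round((z_alpha * sd_null + z_beta * sd_alt) ** 2 / delta ** 2)


print(sample_size_per_variation(0.05, 0.15))  # → 13533
print(sample_size_per_variation(0.05, 0.08))  # → 47127
```

Notice how sensitive the result is to the effect size: cutting the MDE roughly in half more than triples the required sample.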

Whether you’re A/B testing UX or SEO split testing, your sample size should be calculated upfront, before your test starts. Your test can’t stop, even if it reaches significance, until the predetermined sample size is reached. If it does, the test isn’t valid.

This is why you can’t aimlessly follow best practices, like “stop after 100 conversions.”

It’s also important to run tests for full-week increments. Your traffic can change based on the day of the week and the time of day, so you’ll want to be sure to include every day of the week.

Why should you A/B test?

Let’s say you spend $100 on Facebook ads to send 10 people to your site. Your average order value is $25. Eight of those visitors leave without buying anything and the other two spend $25 each. The result? You lost $50.

Now let’s say you spend $100 on Facebook ads to send 10 people to your site. Your average order value is still $25. This time, though, only five of those visitors leave without buying anything and the other five spend $25 each. The result? You made $25.

This is one of the simpler A/B testing examples, of course. But by increasing the conversion rate on-site, you made the same traffic more valuable.

A/B testing images and copy also helps you uncover insights, whether your test wins or loses. This value is very transferable. For example, a copywriting insight from a product description A/B test could help inform your value proposition, a product video, or other product descriptions.
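The two scenarios above come down to a few lines of arithmetic. This sketch just restates them in code; `campaign_profit` is a hypothetical helper, and the figures are the ones from the example.

```python
def campaign_profit(ad_spend: float, visitors: int, buyers: int,
                    average_order_value: float) -> dict:
    """Return conversion rate, revenue, and profit for one ad campaign."""
    revenue = buyers * average_order_value
    return {
        "conversion_rate": buyers / visitors,
        "revenue": revenue,
        "profit": revenue - ad_spend,
    }


before = campaign_profit(ad_spend=100, visitors=10, buyers=2, average_order_value=25)
after = campaign_profit(ad_spend=100, visitors=10, buyers=5, average_order_value=25)

print(before)  # conversion rate 0.2, revenue 50, profit -50
print(after)   # conversion rate 0.5, revenue 125, profit 25
```

Same ad spend, same traffic, same order value. Only the conversion rate changed.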

You also can’t ignore the inherent value of focusing on continuously improving the effectiveness of your store.

Should you be A/B testing?

Not necessarily. If you’re running a low-traffic site or a web or mobile app, A/B testing is probably not the best optimization effort for you. You will likely see a higher return on investment (ROI) from conducting user testing or talking to your customers, for example.

Despite popular belief, conversion rate optimization does not begin and end with testing.

Consider the numbers from the sample size calculator above: 47,127 visitors per variation to detect an 8% effect if your baseline conversion rate is 5%. Let’s say you want to test a product page. Does it receive nearly 100,000 visitors in two to four weeks?

Why two to four weeks? Remember, we want to run tests for at least two full business cycles. Usually, that works out to two to four weeks. Now maybe you’re thinking, “No problem, Shanelle, I’ll run the test for longer than two to four weeks to reach the required sample size.” That won’t work, either.

You see, the longer a test is running, the more susceptible it is to external validity threats and sample pollution. For example, visitors might delete their cookies and end up re-entered into the A/B test as a new visitor. Or someone could switch from their mobile phone to desktop and see an alternate variation.

Essentially, letting your test run for too long is as bad as not letting it run long enough.

Testing is worth the investment for stores that can meet the required sample size in two to four weeks. Stores that can’t should consider other forms of optimization until their traffic increases.
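A quick way to sanity-check this for your own store is sketched below. It assumes you already know the required sample size per variation and your weekly traffic to the page being tested; the helper names and the traffic figures are hypothetical.

```python
from math import ceil


def weeks_to_reach_sample(sample_per_variation: int, variations: int,
                          weekly_visitors: int) -> int:
    """Full weeks needed to reach the total sample (tests run in whole weeks)."""
    total_needed = sample_per_variation * variations
    return ceil(total_needed / weekly_visitors)


def is_test_feasible(sample_per_variation: int, variations: int,
                     weekly_visitors: int, max_weeks: int = 4) -> bool:
    """True if the sample can be collected within the two-to-four-week window."""
    return weeks_to_reach_sample(sample_per_variation, variations, weekly_visitors) <= max_weeks


# 47,127 per variation comes from the 8% MDE example earlier in the article.
print(weeks_to_reach_sample(47_127, 2, weekly_visitors=30_000))  # → 4
print(weeks_to_reach_sample(47_127, 2, weekly_visitors=10_000))  # → 10
print(is_test_feasible(47_127, 2, weekly_visitors=10_000))       # → False
```

If the answer is well beyond four weeks, that’s a sign to test a bigger change (a larger MDE needs a smaller sample) or to lean on other research methods for now.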

Julia Starostenko, data scientist at Shopify, agrees, explaining:

“Experimenting is fun! But it is important to make sure that the results are accurate.

“Ask yourself: Is your audience large enough? Have you collected enough data? In order to achieve true statistical significance (within a reasonable timeframe) the audience size needs to be large enough.”

What should you A/B test?

I can’t tell you what you should A/B test. I know, I know. It would certainly make your life easier if I could give you a list of 99 things to test right now. There’s no shortage of marketers willing to do that in exchange for the clicks.

Truth is, the only tests worth running are tests based on your own data. I don’t have access to your data, your customers, etc., and neither does anyone curating those huge lists of A/B test ideas. None of us can meaningfully tell you what to test.

Instead, I encourage you to answer this question for yourself through qualitative and quantitative analysis. Some popular research methods are:

Technical analysis. Does your store load properly and quickly on every browser? On every device? You might have a shiny new iPhone 11, but someone somewhere is still rocking a Motorola Razr from 2005. If your site doesn’t work properly and quickly, it definitely doesn’t convert as well as it could.

On-site surveys. These pop up as your store’s visitors browse around. For example, an on-site survey might ask visitors who have been on the same page for a while if there’s anything holding them back from making a purchase today. If so, what is it? You can use this qualitative data to improve your copy and conversion rate.

Customer interviews. Nothing can replace getting on the phone and talking to your customers. Why did they choose your store over competing stores? What problem were they trying to solve when they arrived on your site? There are a million questions you could ask to get to the heart of who your customers are and why they really buy from you.

Customer surveys. Customer surveys are full-length surveys that go out to people who have already made a purchase (as opposed to visitors). When designing a survey, you want to focus on: defining your customers, defining their problems, defining hesitations they had prior to purchasing, and identifying words and phrases they use to describe your store.

Analytics analysis. Are your analytics tools tracking and reporting your data properly? That might sound silly, but you’d be surprised by how many analytics tools are configured incorrectly. Analytics analysis is all about figuring out how your visitors behave. For example, you might focus on the funnel. Where are your biggest conversion funnel leaks? In other words, where are most people dropping out of your funnel? That’s a good place to start testing.

User testing. This is where you watch real people in a paid, controlled experiment try to perform tasks on your site. For example, you might ask them to find a video game in the $40–$60 range and add it to their cart. While they’re performing these tasks, they narrate their thoughts and actions out loud.

Session replays. Session replays are similar to user testing, but now you’re dealing with real people with real money and real intent to buy. You’ll watch as your actual visitors navigate your site. What do they have trouble finding? Where do they get frustrated? Where do they seem confused?

There are additional types of research as well, but start by choosing the research methods that suit you best. If you run through some of them, you will have a huge laundry list of data-informed ideas worth testing. I guarantee your list will bring you more value than any “99 things to test right now” article ever could.

Prioritizing A/B test ideas

A huge list of A/B test ideas is exciting, but not exactly helpful for deciding what to test. Where do you start? That’s where prioritization comes in.

There are a few common prioritization frameworks you can use:

ICE. ICE stands for impact, confidence, and ease. Each of those factors receives a 1–10 ranking. For example, if you could easily run the test by yourself without help from a developer or designer, you might give ease an eight. You’re using your judgement here, and if you have more than one person running tests, rankings may become too subjective. It helps to have a set of guidelines to keep everyone objective.

PIE. PIE stands for potential, importance, and ease. Again, each factor receives a 1–10 ranking. For example, if the test will reach 90% of your traffic, you might give importance an eight. PIE is as subjective as ICE, so guidelines can be helpful for this framework as well.

PXL. PXL is the prioritization framework from CXL. It’s a little bit different and more customizable, forcing more objective decisions. Instead of three factors, you’ll find yes/no questions and an ease-of-implementation question. For example, the framework might ask: “Is the test designed to increase motivation?” If yes, it gets a 1. If no, it gets a 0. You can learn more about this framework and download the spreadsheet from CXL.
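To show how a scoring framework turns a list of ideas into a ranked backlog, here is a minimal ICE sketch. How the three scores get combined is an assumption on my part (a simple average here; some teams multiply instead), and the ideas and scores are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Idea:
    name: str
    impact: int       # 1-10
    confidence: int   # 1-10
    ease: int         # 1-10

    @property
    def ice_score(self) -> float:
        # Assumption: ICE is taken as the plain average of the three scores.
        return (self.impact + self.confidence + self.ease) / 3


ideas = [
    Idea("Shorten checkout form", impact=8, confidence=6, ease=4),
    Idea("Rewrite product page headline", impact=5, confidence=7, ease=9),
    Idea("Add trust badges to cart", impact=4, confidence=5, ease=8),
]

# Highest score first: this becomes the order in which ideas are tested.
ranked = sorted(ideas, key=lambda idea: idea.ice_score, reverse=True)
for idea in ranked:
    print(f"{idea.ice_score:.1f}  {idea.name}")
```

The same structure works for PIE by swapping the field names. The ranking is only as objective as the scores you feed it, which is why shared guidelines matter.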

Now you have an idea of where to start, but it can also help to categorize your ideas. For example, during some conversion research I did recently, I used three categories: implement, investigate, and test.

Implement. Just do it. It’s broken or obvious.
Investigate. Requires extra thought to define the problem or narrow in on a solution.
Test. The idea is sound and data informed. Test it!

Between this categorization and prioritization, you’re set.

A crash course in A/B testing statistics

Before you run a test, it’s important to dig into statistics. I know, statistics usually aren’t a fan favorite, but think of this as the required course you begrudgingly take to graduate.

Statistics is a big part of A/B testing. Fortunately, A/B testing tools and split testing software have made the job of an optimizer easier, but a basic understanding of what’s happening behind the scenes is crucial for analyzing your test results later on.

Alex Birkett, Growth Marketing Manager at HubSpot, explains:

“Statistics isn’t a magic number of conversions or a binary ‘Success!’ or ‘Failure!’ thing. It’s a process used to make decisions under uncertainty and to reduce risk by trying to reduce the fogginess on what the outcome of a given decision will be.

“With that in mind, I think it’s most necessary to know the basics: what’s a mean, variance, sampling, standard deviation, regression to the mean, and what constitutes a ‘representative’ sample. In addition, it helps when you’re starting out with A/B testing to set up some specific guardrails to mitigate as much human error as possible.”

What is mean?

Mean is the average. Your goal is to find a mean that is representative of the whole.

For example, let’s say you’re trying to find the average price of video games. You’re not going to add the price of every video game in the world and divide it by the number of all the video games in the world. Instead, you’ll isolate a small sample that is representative of all of the video games in the world.

You might end up finding the average price of a couple hundred video games. If you’ve selected a representative sample, the mean price of those two hundred video games should be representative of all the video games in the world.

What is sampling?

The larger the sample size, the less variability there will be, which means the mean is more likely to be accurate.

So, if you increased your sample from two hundred video games to two thousand video games, you’d have less variance and a more precise mean.

What is variance?

Variance is the average variability. Essentially, the higher the variability, the less accurate the mean will be in predicting an individual data point.

So, how close is the mean to the actual price of each individual video game?
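Here is a small simulation of the video game example, as a sketch only. The "population" of prices is randomly generated (an assumption, since nobody has the price of every game in the world), and the standard error is used as a rough gauge of how precise each sample’s mean is.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(42)  # fixed seed so the example is repeatable

# Hypothetical population: 50,000 video game prices between $5 and $70.
population = [round(random.uniform(5, 70), 2) for _ in range(50_000)]


def describe(sample: list) -> dict:
    """Mean, spread, and precision of the mean for one sample."""
    spread = stdev(sample)  # standard deviation = square root of the variance
    return {
        "n": len(sample),
        "mean": mean(sample),
        "stdev": spread,
        "standard_error": spread / sqrt(len(sample)),
    }


small = describe(random.sample(population, 200))
large = describe(random.sample(population, 2_000))

print(small)
print(large)
```

Both samples have a similar spread of individual prices, but the larger sample’s standard error is much smaller, which is what "a more precise mean" looks like in numbers.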

What is statistical significance?

Assuming there’s no difference between A and B, how often will you see the effect just by chance?

The lower the statistical significance level, the bigger the chance that your winning variation is not a winner at all.

Simply put, a low significance level means that there is a big chance your “winner” is not a real winner (this is known as a false positive).

Be aware that most A/B testing tools and open source A/B testing software call statistical significance without waiting for a predetermined sample size or point in time to be reached. That’s why you might notice your test flipping back and forth between statistically significant and statistically insignificant.
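For a feel of what sits behind that icon, here is a minimal sketch of a classic two-proportion z-test. This is a textbook approach used for illustration; it is an assumption that it resembles what any specific tool does, since many tools use their own methods. A p-value below 0.05 corresponds to what tools label "95% statistical significance."

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value for the difference between two conversion rates."""
    rate_a, rate_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)   # rate if A and B were identical
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical results with 10,000 visitors per version.
clear_gap = two_proportion_p_value(500, 10_000, 580, 10_000)   # 5.0% vs 5.8%
tiny_gap = two_proportion_p_value(500, 10_000, 510, 10_000)    # 5.0% vs 5.1%

print(clear_gap < 0.05)  # → True
print(tiny_gap < 0.05)   # → False
```

The p-value answers the question at the top of this section: if A and B were really the same, how often would a gap this large show up by chance?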

Peep Laja, founder of CXL Institute, wants more people to really understand A/B test statistical significance and why it’s important:

“Statistical significance does not equal validity—it’s not a stopping rule. When you reach 95% statistical significance or higher, that means very little before two other more important conditions have been met:

“1. There’s enough sample size, which you figure out using sample size calculators. Meaning, enough people have been part of the experiment so that we can conclude anything at all.

“2. The test has run long enough so the sample is representative (and not too long to avoid sample pollution). In most cases you’ll want to run your tests two, three, or four weeks, depending on how fast can you get the needed sample.”

What is regression to the mean?

You might notice extreme fluctuations at the beginning of your A/B test.

Regression to the mean is the phenomenon that says if something is extreme on its first measurement, it will likely be closer to the average on its second measurement.

If the only reason you’re calling a test is because it’s reached statistical significance, you could be seeing a false positive. Your winning variation will likely regress to the mean over time.
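You can see this effect with a simulated A/A test, where both versions are identical and any "winner" is a false positive by construction. The sketch below compares checking for significance repeatedly during the test against checking once at the end. All parameters (number of runs, traffic, how often you peek) are arbitrary assumptions chosen to keep the example fast.

```python
import random
from math import sqrt
from statistics import NormalDist


def p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value from a two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    if pooled in (0, 1):
        return 1.0  # no variation yet, so nothing to compare
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (conv_b / n_b - conv_a / n_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate_aa_tests(runs=300, visitors_per_arm=2_000, check_every=200,
                      rate=0.05, seed=7):
    """Return (share of runs ever significant while peeking, share significant at the end)."""
    rng = random.Random(seed)
    ever_significant = significant_at_end = 0
    for _ in range(runs):
        conv_a = conv_b = 0
        peeked = False
        for n in range(1, visitors_per_arm + 1):
            conv_a += rng.random() < rate   # both arms share the same true rate
            conv_b += rng.random() < rate
            if n % check_every == 0 and p_value(conv_a, n, conv_b, n) < 0.05:
                peeked = True               # a peeker would have called it here
        ever_significant += peeked
        significant_at_end += p_value(conv_a, visitors_per_arm,
                                      conv_b, visitors_per_arm) < 0.05
    return ever_significant / runs, significant_at_end / runs


peeking_rate, final_rate = simulate_aa_tests()
print(f"false positives when peeking: {peeking_rate:.0%}")
print(f"false positives at the planned end: {final_rate:.0%}")
```

Calling the test at the first sign of significance can only raise the false positive rate, because every extra look is another chance for random noise to cross the line before it settles back toward the mean.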

What is statistical power?

Assuming there’s a difference between A and B, how often will you see the effect?

The lower the power level, the bigger the chance that a winner will go unrecognized. The higher the power level, the lower the chance that a winner will go unrecognized. Really, all you’ll need to know is that 80% statistical power is standard for most A/B testing tools and/or any split-testing service.
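To connect power back to sample size, here is a minimal sketch that estimates the power of a test for a given number of visitors per variation. It uses a normal approximation and assumes a two-sided test at a 5% significance level; the function name and the example inputs are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def power_for_sample(baseline: float, relative_mde: float,
                     n_per_variation: int, alpha: float = 0.05) -> float:
    """Approximate chance of detecting a real effect of the given size."""
    delta = baseline * relative_mde
    p2 = baseline + delta
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    sd_null = sqrt(2 * baseline * (1 - baseline))
    sd_alt = sqrt(baseline * (1 - baseline) + p2 * (1 - p2))
    z = (delta * sqrt(n_per_variation) - z_alpha * sd_null) / sd_alt
    return NormalDist().cdf(z)


# 5% baseline, 15% relative effect, as in the sample size example earlier.
print(f"{power_for_sample(0.05, 0.15, 13_533):.0%}")  # → 80%
print(f"{power_for_sample(0.05, 0.15, 3_000):.0%}")   # well under 50%
```

With only 3,000 visitors per variation, a real 15% improvement would go unrecognized more often than not, which is the false negative problem in a nutshell.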

Ton Wesseling, founder of Online Dialogue, wishes more people knew about statistical power:

“Lots of people worry about false positives. We worry way more about false negatives. Why run experiments where the chances of finding proof that your positive change has an impact is really low?”

What are external validity threats?

There are external factors that threaten the validity of your tests. For example:

Black Friday Cyber Monday (BFCM) sales
A positive or negative press mention
A major paid campaign launch
The day of the week
The changing seasons

One of the more common situations where external validity threats impact your results is a seasonal event. Say you were to run a test during December. Major shopping holidays would mean an increase in traffic for your store during that month. You might find in January that your December winner is no longer performing well.

Why?

Because of an external validity threat: the holidays.

The data you based your test decision on was an anomaly. When things settle down in January, you might be surprised to find your winner losing.

You can’t eliminate external validity threats, but you can mitigate them by running tests for full weeks (e.g., don’t start a test on a Monday and end it on a Friday), including different types of traffic (e.g., don’t test paid traffic exclusively and then roll out the results to every traffic source), and being mindful of potential threats.

If you happen to be running a test during a busy shopping season, like BFCM, or through a major external validity threat, you might want to read our Complete Guide to A/B Testing.

How to set up an A/B test

Let’s walk through a little A/B testing tutorial. Before you test anything, you need to have a solid hypothesis. (Great, we just finished math class and now we’re on to science.)

Don’t worry, it’s not complicated. Basically, you need to test a hypothesis, not an idea. A hypothesis is measurable, aspires to solve a specific conversion problem, and focuses on insights instead of wins.

Whenever I’m writing a hypothesis, I use a formula borrowed from Craig Sullivan’s Hypothesis Kit:

Because you see [insert data/feedback from research]
You expect that [change you’re testing] will cause [impact you anticipate] and
You’ll measure this using [data metric]

Easy, right? All you have to do is fill in the blanks and your test idea has transformed into a hypothesis.

Choosing an A/B testing tool

Now you can start choosing an A/B testing tool or split testing service. More often than not, you’ll think of Google Optimize, Optimizely, and VWO first.

All are good, safe options.

Google Optimize. Free, save for some multivariate limitations, which shouldn’t really impact you if you’re just getting started. It works well when performing Google Analytics A/B testing, which is a plus.

Optimizely. Easy to get smaller tests up and running, even without technical skills. Stats Engine makes it easier to analyze test results. Typically, Optimizely is the most expensive option of the three.

VWO. VWO has SmartStats to make analysis easier. Plus, it has a great WYSIWYG editor for beginners. Every VWO plan comes with heatmaps, on-site surveys, form analytics, etc.

We also have some A/B testing tools in the Shopify App Store that you might find helpful.

Once you’ve selected an A/B testing tool or split-testing software, fill out the sign-up form and follow the instructions provided. The process varies from tool to tool. Typically, though, you’ll be asked to install a snippet on your site and set goals.

How to analyze A/B test results

Remember when I said writing a hypothesis shifts the focus from wins to insights? Krista Seiden, Analytics Advocate and Product Manager at Google, explains what that means:

“The most overlooked aspect of A/B testing is learning from your losers. In fact, in the optimization programs I’ve run, I make a habit of publishing a ‘failures report’ where I call out some of the biggest losers of the quarter and what we learned from them.

“One of my all time favorites was from a campaign that was months in the making. We were able to sneak in landing page testing just before it was set to go live, and it’s a good thing we did, because it failed miserably. Had we actually launched the page as it was, we would have taken a significant hit to the bottom line. Not only did we end up saving the business a ton of money, but we were able to dig in and make some assumptions (that we later tested) about why the new page had performed so poorly, and that made us better marketers and more successful in future campaigns.”

If you craft your hypothesis correctly, even a loser is a winner, because you’ll gain insights you can use for future tests and in other areas of your business. So, when you’re analyzing your test results, you need to focus on the insights, not whether the test won or lost. There’s always something to learn, always something to analyze. Don’t dismiss the losers!

The most important thing to note here is the need for segmentation. A test might be a loser overall, but chances are it performed well with at least one segment. What do I mean by segment?

New visitors
Returning visitors
iOS visitors
Android visitors
Chrome visitors
Safari visitors
Desktop visitors
Tablet visitors
Organic search visitors
Paid visitors
Social media visitors
Logged-in buyers

You get the idea, right?

When you’re looking at the results in your testing tool, you’re looking at the whole box of candies. What you need to do is separate the Smarties by color so you can eat the red ones last. I mean, so you can uncover deeper, segmented insights.

Odds are that the hypothesis was proven right among certain segments. That tells you something as well.

Analysis is about so much more than whether the test was a winner or a loser. Segment your data to find hidden insights below the surface.
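Here is a minimal sketch of what segmenting results can look like once you export them from your tool. The data is invented, the two device segments are just an example, and the helper names are hypothetical. Keep in mind that each segment has a smaller sample than the whole test, so a segment-level difference needs its own significance check before you act on it.

```python
from collections import defaultdict

# Hypothetical export: (segment, variation, visitors, orders)
results = [
    ("desktop", "control", 4_000, 200),
    ("desktop", "variant", 4_000, 160),
    ("mobile", "control", 6_000, 240),
    ("mobile", "variant", 6_000, 264),
]


def rates_by_segment(rows):
    """Conversion rate per variation, grouped by segment."""
    table = defaultdict(dict)
    for segment, variation, visitors, orders in rows:
        table[segment][variation] = orders / visitors
    return dict(table)


def overall_rate(rows, variation):
    """Conversion rate for one variation across all segments."""
    visitors = sum(v for _, name, v, _ in rows if name == variation)
    orders = sum(o for _, name, _, o in rows if name == variation)
    return orders / visitors


def relative_lift(control_rate, variant_rate):
    return (variant_rate - control_rate) / control_rate


segments = rates_by_segment(results)
print(f"overall control: {overall_rate(results, 'control'):.2%}")
print(f"overall variant: {overall_rate(results, 'variant'):.2%}")
for name, rates in segments.items():
    lift = relative_lift(rates["control"], rates["variant"])
    print(f"{name}: {lift:+.0%} lift")
```

In this made-up data the variant loses overall but wins on mobile, exactly the kind of insight that stays hidden if you only look at the whole box of candies.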

A/B testing tools won’t do the analysis for you, so this is an important skill to develop over time.

How to archive past A/B tests

Let’s say you run your first test tomorrow. Two years from tomorrow, will you remember the details of that test? Not likely.

That’s why archiving your A/B testing results is important. Without a well-maintained archive, all those insights you’re gaining will be lost. Plus, I kid you not, it’s very easy to test the same thing twice if you’re not archiving.

There’s no “right” way to do this, though. You could use a tool like Projects or Effective Experiments, or you could use Excel. It’s really up to you, especially when you’re just getting started. Just make sure you’re keeping track of:

The hypothesis
Screenshots of the control and variation
Whether it won or lost
Insights gained through analysis
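If a spreadsheet is your archive, the record can be as simple as one row per test. The sketch below is one possible shape for that row, written to a CSV file you can open in Excel. The field names, the example test, and the file layout are all assumptions for illustration.

```python
import csv
from dataclasses import asdict, dataclass, fields
from pathlib import Path


@dataclass
class ArchivedExperiment:
    name: str
    hypothesis: str
    control_screenshot: str     # path or URL to the screenshot
    variation_screenshot: str
    outcome: str                # "won", "lost", or "inconclusive"
    insights: str


def append_to_archive(record: ArchivedExperiment, path) -> None:
    """Append one experiment to a CSV archive, writing the header if the file is new."""
    path = Path(path)
    is_new_file = not path.exists()
    with path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=[f.name for f in fields(record)])
        if is_new_file:
            writer.writeheader()
        writer.writerow(asdict(record))


# Hypothetical entry for a finished test.
example = ArchivedExperiment(
    name="Product page: shipping message",
    hypothesis="Because we see shipping questions in surveys, we expect that "
               "showing delivery times will increase add-to-carts.",
    control_screenshot="screenshots/shipping-control.png",
    variation_screenshot="screenshots/shipping-variation.png",
    outcome="lost",
    insights="Delivery times mattered to mobile visitors only.",
)
```

Calling `append_to_archive(example, "ab_test_archive.csv")` after each test keeps everything in one file that new hires can search later.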

As you grow, you’ll thank yourself for keeping this archive. Not only will it help you, it will help new hires and advisers/stakeholders as well.

A/B testing processes of the pros

Now that you’ve been through a standard A/B testing tutorial, let’s take a look at the exact processes of pros from companies like Google and HubSpot.

Krista Seiden, Google

My step-by-step process for web and app A/B testing starts with analysis—in my opinion, this is the core of any good testing program. In the analysis stage, the goal is to examine your analytics data, survey or UX data, or any other sources of customer insight you might have in order to understand where your opportunities for optimization are.

Once you have a good pipeline of ideas from the analysis stage, you can move on to hypothesize what might be going wrong and how you could potentially fix or improve these areas of optimization.

Next, it’s time to build and run your tests. Be sure to run them for a reasonable amount of time (I default to two weeks to ensure I’m accounting for week over week changes or anomalies), and when you have enough data, analyze your results to determine your winner.

It’s also important to take some time in this stage to analyze the losers as well—what can you learn from these variations?

Finally, and you may only reach this stage once you’ve spent time laying the groundwork for a solid optimization program, it’s time to look into personalization. This doesn’t necessarily require a fancy toolset but rather can come out of the data you have about your users.

Marketing personalization can be as easy as targeting the right content to the right locations or as complex as targeting based on individual user actions. Don’t jump in all at once on the personalization bit though. Be sure you spend enough time to get the basics right first.

Alex Birkett, HubSpot

At a high level, I try to follow this process:

Collect data and make sure analytics implementations are accurate.
Analyze data and find insights.
Turn insights into hypotheses.
Prioritize based on impact and ease, and maximize allocation of resources (especially technical resources).
Run a test (following statistics best practices to the best of my knowledge and ability).
Analyze results and implement or not according to the results.
Iterate based on findings, and repeat.

Put more simply: research, test, analyze, repeat.

While this process can deviate or change based on what the context is (Am I testing a business-critical product feature? A blog post CTA? What’s the risk profile and balance of innovation vs. risk mitigation?), it’s pretty applicable to any size or type of company.

The point is this process is agile, but it also collects enough data, both qualitative customer feedback and quantitative analytics, to be able to come up with better test ideas and better prioritize them so you can drive traffic to your online store.

Ton Wesseling, Online Dialogue

The first question we always answer when we want to optimize a customer journey is: Where does this product or service fit on the ROAR model we created at Online Dialogue? Are you still in the risk phase where we could do lots of research but can’t validate our findings through A/B test online experiments (below 1,000 conversions per month), or are you in the optimization phase? Or even above?

Risk phase: lots of research, which will be translated into anything from a business model pivot to a whole new design and value proposition.
Optimization phase: large experiments that will optimize the value proposition and the business model.
Optimization phase: small experiments to validate user behavior hypotheses, which will build up knowledge for larger design changes.
Automation: you still have experimentation power (visitors) left, meaning your full test potential is not needed to validate your user journey. What’s left should be used to exploit, to grow faster now (without focus on long-term learnings). This could be automated by running bandits/using algorithms.
Re-think: you stop adding lots of research, unless it’s a pivot to something new.

So web or app A/B testing is only a big thing in the optimization phase of ROAR and beyond (until re-think).

Our approach to running experiments is the FACT & ACT model.

The research we do is based on our 5V Model.

We gather all these insights to come up with a main research-backed hypothesis, which will lead to sub-hypotheses that will be prioritized based on the data gathered through either desktop or mobile A/B testing. The higher the chance of the hypothesis being true, the higher it will be ranked.

Once we learn if our hypothesis is true or false, we can start combining learnings and take bigger steps by redesigning/realigning larger parts of the customer journey. However, at some point, all winning implementations will lead to a local maximum. Then you need to take a bigger step to be able to reach a potential global maximum.

And, of course, the main learnings will be spread throughout the company, which leads to all sorts of broader optimization and innovation based on your validated first-party insights.

Julia Starostenko, Shopify

The purpose of an experiment is to validate that making changes to an existing webpage will have a positive impact to the business.

Before getting started, it’s important to determine if running an experiment is truly necessary. Consider the following scenario: there is a button with an extremely low click rate. It would be near impossible to decrease the performance of this button. Validating the effectiveness of a proposed change to the button (i.e., running an experiment) is therefore not necessary.

Similarly, if the proposed change to the button is small, it probably isn't worth spending the time setting up, executing, and tearing down an experiment. In this case, the changes should just be rolled out to everyone and performance of the button can be monitored.

If it is determined that running an experiment would in fact be beneficial, the next step is to define the business metrics that should be improved (e.g., increase the conversion rate of a button). Then we ensure that proper data collection is in place.

Once this is complete, the audience is randomly split between two groups; one group is shown the existing version of the button while the other group gets the new version. The conversion rate of each audience is monitored, and once statistical significance is reached, the results of the experiment are determined.

Peep Laja, CXL Institute

A/B testing is a part of a bigger conversion optimization picture. In my opinion it’s 80% about the research and only 20% about testing. Conversion research will help you determine what to test to begin with.

My process typically looks like this (a simplified summary):

Conduct conversion research using a framework like ResearchXL to identify issues on your site.
Pick a high priority issue (one that affects a large portion of users and is a severe issue), and brainstorm as many solutions to this problem as you can. Inform your ideation process with your conversion research insights. Determine which device you want to run the test on (you need to run mobile A/B testing separate from desktop).
Determine how many variations you can test (based on your traffic/transaction level), and then pick your best one to two ideas for a solution to test against control.
Wireframe the exact treatments (write the copy, make the design changes, etc.). Depending on the scope of changes, you might also need to include a designer to design new elements.
Have your front-end developer implement the treatments in your testing tool. Set up necessary integrations (Google Analytics), set appropriate goals.
Conduct QA on the test (broken tests are by far the biggest A/B testing killer) to make sure it works with every browser/device combo.
Launch the test!
Once the test is done, conduct post-test analysis.
Depending on the outcome either implement the winner, iterate on the treatments, or go and test something else.

Optimize A/B testing for your business

You have the process, you have the power! So, get out there, get the best A/B testing software, and start testing your store. Before you know it, those insights will add up to more money in the Bank of You.

If you want to continue learning about optimization, consider taking a free course, such as Udacity’s A/B testing by Google. You can learn more about web and mobile app A/B testing to boost your optimization skill set.

About the author

Shanelle Mullin

Shanelle Mullin works on growth, experimentation, and conversion rate optimization at Shopify.
