Understanding Results
This guide explains how to read experiment results and avoid common mistakes.
Key terms (in plain language)
- Exposure: A visitor was assigned to a variant.
- Control: Your baseline version.
- Variant: A version you’re comparing against control.
- Lift vs control: How much better or worse a variant is compared to control.
- Confidence: How likely it is that the difference you’re seeing is real (not just random noise).
How to interpret the results table
You’ll typically see, for each metric:
- A value for each variant
- A comparison vs control (positive or negative)
- A confidence indicator
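The "comparison vs control" column is usually a relative lift. As a quick illustration of what that number means (the function name here is hypothetical, not this product's API):

```python
def lift_vs_control(variant_rate, control_rate):
    """Relative lift of a variant's metric vs control, as a fraction.

    Positive means the variant is doing better than control;
    negative means worse.
    """
    return (variant_rate - control_rate) / control_rate

# If control converts at 10% and the variant at 11%,
# the relative lift is (0.11 - 0.10) / 0.10 = 0.10, i.e. +10%.
```

A +10% lift on a 10% baseline moves the absolute rate to 11%, not to 20% — relative and absolute lift are easy to mix up when reading the table.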
Practical guidance:
- Early results are noisy. Wait until you have a meaningful number of exposures.
- Prefer consistent improvements over “spiky” wins that disappear as the sample grows.
- Focus on your primary metric first, then confirm secondary metrics and guardrails.
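To make "a meaningful number of exposures" concrete, a common rule of thumb (not this platform's exact method) estimates how many visitors per variant you need to detect a given relative lift at roughly 80% power and a 5% significance level:

```python
def sample_size_per_variant(baseline_rate, min_detectable_lift):
    """Rough exposures needed per variant, via the common
    n ~= 16 * p * (1 - p) / delta^2 approximation
    (~80% power, alpha = 0.05, two-sided).

    baseline_rate: control conversion rate, e.g. 0.10 for 10%
    min_detectable_lift: smallest relative lift worth detecting, e.g. 0.10
    """
    delta = baseline_rate * min_detectable_lift  # absolute effect size
    return 16 * baseline_rate * (1 - baseline_rate) / delta ** 2

# Detecting a +10% relative lift on a 10% baseline
# takes on the order of 14,400 exposures per variant.
```

The takeaway: small lifts on small baselines need a lot of traffic, which is why early results swing so much.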
“Statistical significance” (what it means for you)
You don’t need to do statistics to use experiments well.
In practice, “statistically significant” means:
- The system has enough data to be confident the difference is unlikely to be due to chance alone.
What to do with it:
- If confidence is low: keep running the experiment.
- If confidence is high and the lift is meaningful: consider shipping the winner.
- If confidence is high and the lift is negative: consider stopping early.
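For intuition only: many experimentation tools base their confidence indicator on a frequentist test such as the two-proportion z-test (others use Bayesian methods; this guide's product may differ). A minimal stdlib sketch of that test:

```python
from statistics import NormalDist

def two_proportion_p_value(conversions_a, exposures_a,
                           conversions_b, exposures_b):
    """Two-sided p-value for the difference between two conversion rates.

    A small p-value (commonly < 0.05) corresponds to "high confidence"
    that the difference is unlikely to be chance alone.
    """
    rate_a = conversions_a / exposures_a
    rate_b = conversions_b / exposures_b
    # Pooled rate under the null hypothesis that both variants are equal
    pooled = (conversions_a + conversions_b) / (exposures_a + exposures_b)
    se = (pooled * (1 - pooled)
          * (1 / exposures_a + 1 / exposures_b)) ** 0.5
    z = (rate_b - rate_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Identical rates -> p-value of 1.0 (no evidence of a difference).
# 10% vs 15% on 1,000 exposures each -> p-value well below 0.05.
```

You don't need to run this yourself — it just shows what "enough data to be confident" is grounding out to.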
Common pitfalls
Stopping too early
Small samples can produce big swings. Give the experiment time to stabilize.
Testing multiple big changes at once
If you change several things at once, it’s hard to know what caused the impact.
Ignoring guardrails
A win on conversion can hide a loss in quality. Always check guardrails.
Uneven traffic split
If exposures are wildly different from your intended split, check your targeting and routing rules.
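This failure mode is often called sample ratio mismatch (SRM), and the standard check is a chi-square goodness-of-fit test on the exposure counts. A small sketch, assuming a two-variant 50/50 split (the 3.84 threshold is the chi-square critical value for one degree of freedom at the 5% level):

```python
def srm_chi_square(observed_exposures, expected_ratios):
    """Chi-square statistic comparing observed exposure counts
    to the intended traffic split."""
    total = sum(observed_exposures)
    expected = [r * total for r in expected_ratios]
    return sum((obs - exp) ** 2 / exp
               for obs, exp in zip(observed_exposures, expected))

# Intended 50/50 split. 5,300 vs 4,700 exposures looks close,
# but the chi-square statistic is 36 -- far above the 3.84
# threshold, so the split is almost certainly broken.
CRITICAL_VALUE_DF1 = 3.84

def has_srm(observed_exposures, expected_ratios):
    return srm_chi_square(observed_exposures, expected_ratios) > CRITICAL_VALUE_DF1
```

An imbalance that looks small to the eye can still be a strong signal of a routing bug, so check the counts rather than eyeballing them.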