Sample Ratio Mismatch (SRM) occurs when your A/B test's actual traffic split differs significantly from the expected split. If you configured 50/50 but see 55/45 on any reasonably sized sample, you have SRM. It invalidates your test results because it signals a bug somewhere in your experiment setup. Detect it with a chi-squared test (p < 0.001 indicates SRM). Common causes: bot traffic, JavaScript errors, and caching issues.
Who this is for
- Data analysts validating A/B test results
- Engineers debugging experiment implementations
- Product managers who want to understand experiment validity
Who this is NOT for
- Complete beginners (see our intro to A/B testing first)
What is Sample Ratio Mismatch?
Sample Ratio Mismatch (SRM) is a data quality issue in A/B testing where the observed ratio of users between variants differs significantly from the expected ratio.
✓ No SRM (Valid): Expected 50% / 50%, observed 49.8% / 50.2%. Small variance like this is normal.
✗ SRM Detected (Invalid): Expected 50% / 50%, observed 55% / 45%. A deviation this large indicates a bug.
⚠️ Critical: If you have SRM, your test results are invalid. Do not make decisions based on conversion rates until you fix the underlying issue.
Why SRM Invalidates Your Results
SRM indicates that something is systematically different between how users are assigned or tracked in each variant. This means:
1. Selection bias: The users in each variant may not be comparable. If variant B has fewer users because of a JS error, the remaining users might be more technical (they have JS enabled).
2. Measurement error: If tracking is broken for one variant, you're not measuring the same thing in both groups.
3. Unknown confounders: The cause of SRM may also affect conversion rates in ways you can't account for.
Example: If variant B has a redirect that takes 500ms longer, impatient users leave before being tracked. The remaining users in B are more patient—and probably more likely to convert. Your "winning" variant might just have better user selection, not a better experience.
How to Detect SRM
Use a chi-squared test to compare observed vs expected sample sizes. If the p-value is below 0.001, you have SRM.
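For example, with 10,000 total users you would expect 5,000 per variant; if you observe 5,500 and 4,500, the chi-squared statistic is (500² / 5,000) + (500² / 5,000) = 100, far above the 10.83 threshold that corresponds to p < 0.001 at one degree of freedom.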
SQL Query to Detect SRM
-- Calculate SRM for a 50/50 split experiment
WITH experiment_counts AS (
    SELECT
        variant,
        COUNT(*) as observed,
        SUM(COUNT(*)) OVER () as total
    FROM experiment_assignments
    WHERE experiment_id = 'your-experiment-id'
    GROUP BY variant
),
expected AS (
    SELECT
        variant,
        observed,
        total * 0.5 as expected  -- 50% expected per variant
    FROM experiment_counts
),
chi_squared AS (
    SELECT
        SUM(POWER(observed - expected, 2) / expected) as chi_sq,
        COUNT(*) - 1 as df  -- degrees of freedom
    FROM expected
)
SELECT
    chi_sq,
    -- p-value approximation (for df = 1)
    -- If chi_sq > 10.83, p < 0.001 (SRM detected)
    CASE
        WHEN chi_sq > 10.83 THEN 'SRM DETECTED - DO NOT TRUST RESULTS'
        WHEN chi_sq > 6.63 THEN 'WARNING - Possible SRM (p < 0.01)'
        ELSE 'OK - No SRM detected'
    END as srm_status
FROM chi_squared;
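The query assumes a 50/50 split. For other splits, compute each variant's expected count from its own share (for a 90/10 test, 0.9 * total for one variant and 0.1 * total for the other, e.g. via a CASE on variant); with two variants the degrees of freedom stay at 1, so the 10.83 and 6.63 cutoffs are unchanged.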
Python Script
from scipy import stats

def detect_srm(observed_a: int, observed_b: int, expected_ratio: float = 0.5) -> dict:
    """
    Detect Sample Ratio Mismatch using a chi-squared test.

    Args:
        observed_a: Number of users in variant A
        observed_b: Number of users in variant B
        expected_ratio: Expected ratio for variant A (default 0.5 for 50/50)

    Returns:
        dict with chi_squared, p_value, and srm_detected
    """
    total = observed_a + observed_b
    expected_a = total * expected_ratio
    expected_b = total * (1 - expected_ratio)
    chi_sq, p_value = stats.chisquare(
        f_obs=[observed_a, observed_b],
        f_exp=[expected_a, expected_b]
    )
    return {
        'chi_squared': chi_sq,
        'p_value': p_value,
        'srm_detected': p_value < 0.001,
        'observed_ratio': observed_a / total,
        'expected_ratio': expected_ratio
    }

# Example usage
result = detect_srm(observed_a=5500, observed_b=4500)
print(f"SRM Detected: {result['srm_detected']}")
print(f"P-value: {result['p_value']:.6f}")
print(f"Observed ratio: {result['observed_ratio']:.2%}")
💡 Tip: Most A/B testing tools (including ExperimentHQ) automatically detect SRM and warn you. Check your dashboard for SRM alerts.
Common Causes of SRM
Bot Traffic
Bots may trigger one variant more than another, especially if variants have different URLs or load patterns.
Fix: Filter bot traffic by user-agent before analysis. Exclude known bot IPs.
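A minimal pandas sketch of that filtering step, assuming your assignment export has a user_agent column (the bot pattern below is illustrative, not exhaustive):

import pandas as pd

# One row per exposure; a user_agent column is assumed to exist in your export
assignments = pd.read_csv("experiment_assignments.csv")
bot_pattern = r"bot|crawler|spider|headless"  # illustrative, not exhaustive
human_only = assignments[
    ~assignments["user_agent"].str.contains(bot_pattern, case=False, na=False)
]
print(human_only["variant"].value_counts())  # counts to feed into the SRM check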
JavaScript Errors
If variant B has a JS error that prevents tracking, you'll see fewer users in that variant.
Fix: Check browser console for errors. Test both variants in multiple browsers.
Redirect Timing
In redirect tests, users may leave before the redirect completes, causing uneven tracking.
Fix: Use server-side redirects or ensure client-side redirects are fast (<100ms).
Caching Issues
CDN or browser caching may serve the same variant to returning users, skewing the split.
Fix: Ensure cache headers respect experiment cookies. Use Vary: Cookie header.
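As a sketch, in a Flask app (an assumption here; adapt to whatever framework serves your pages) you can append the header on every response so caches key on the cookie that stores the variant assignment:

from flask import Flask

app = Flask(__name__)

@app.after_request
def vary_on_cookie(response):
    # Tell CDNs and browsers to cache per cookie, so a cached page
    # doesn't serve one user's variant to everyone
    response.headers.add("Vary", "Cookie")
    return response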
Assignment Bugs
Bugs in your randomization logic may not produce true 50/50 splits.
Fix: Use proven randomization libraries. Audit your assignment code.
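If you do write your own assignment logic, deterministic hash-based bucketing is the standard pattern. An illustrative sketch (not ExperimentHQ's actual assignment code):

import hashlib

def assign_variant(user_id: str, experiment_id: str, split: float = 0.5) -> str:
    """Deterministic bucketing: the same user always gets the same variant."""
    digest = hashlib.sha256(f"{experiment_id}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF  # map the hash to [0, 1]
    return "A" if bucket < split else "B"

# Same input -> same output, so repeat visits never re-randomize the user
print(assign_variant("user-123", "checkout-test"))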
Browser Compatibility
If variant B uses features unsupported in older browsers, those users may not be tracked.
Fix: Test variants in all target browsers. Use feature detection.
SRM Prevention Checklist
- Test tracking in both variants before launching
- Filter bot traffic using user-agent detection
- Use server-side assignment when possible (avoids client-side issues)
- Monitor SRM daily during the experiment (see the monitoring sketch after this checklist)
- Run A/A tests to validate your setup before real experiments
- Check browser compatibility for all variant code
- Ensure cache headers include Vary: Cookie
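For the daily monitoring item above, a small script can reuse the detect_srm function from the Python script earlier and run on a schedule (cron, Airflow, etc.). get_variant_counts is a hypothetical helper; replace it with a query against your own assignment table:

def get_variant_counts(experiment_id: str) -> tuple[int, int]:
    # Hypothetical helper: run the SQL above (or equivalent) against your
    # experiment_assignments table and return today's per-variant counts.
    raise NotImplementedError

def daily_srm_check(experiment_id: str) -> None:
    observed_a, observed_b = get_variant_counts(experiment_id)
    result = detect_srm(observed_a, observed_b)  # detect_srm defined in the script above
    if result["srm_detected"]:
        print(f"ALERT: SRM in {experiment_id} (p = {result['p_value']:.6f}); pause the experiment")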
What to Do If You Detect SRM
1. Stop the experiment: Don't make decisions based on invalid data.
2. Investigate the cause: Check the common causes above. Look at browser breakdowns, bot traffic, and error logs.
3. Fix the issue: Address the root cause before restarting.
4. Restart with fresh data: Don't try to "fix" existing data. Start a new experiment.