Part 4: Probability and Statistics
The A/B Test That Almost Cost Us
Control: 152/1000 = 15.2% conversion
Test: 165/1000 = 16.5% conversion
Improvement: 8.6%
Probability: Quantifying Uncertainty
The Basics
import numpy as np
from collections import Counter

# Simulate rolling a fair die
def roll_die(n_rolls=1000):
    """Simulate rolling a six-sided die."""
    # randint's upper bound is exclusive, so this draws from 1..6
    return np.random.randint(1, 7, size=n_rolls)

rolls = roll_die(10000)
# Count how often each face appeared (tolist() gives plain Python ints)
counts = Counter(rolls.tolist())

print("Probability of each outcome:")
for outcome in sorted(counts.keys()):
    prob = counts[outcome] / len(rolls)
    print(f"{outcome}: {prob:.3f} (expected: 0.167)")

# Law of large numbers: more rolls -> closer to theoretical probability
for n in [100, 1000, 10000, 100000]:
    rolls = roll_die(n)
    prob_six = np.sum(rolls == 6) / n
    print(f"{n:6d} rolls: P(6) = {prob_six:.4f}")

Probability Distributions
Bernoulli Distribution (yes/no events)
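A Bernoulli trial has two outcomes: success with probability p, failure with probability 1 - p. Its mean is p and its variance is p(1 - p). The following is a minimal sketch using only the standard library; the helper name `bernoulli_trial` and the fixed seed are illustrative choices, and p is borrowed from the control conversion rate in the A/B example above.

```python
import random

def bernoulli_trial(p, rng=random):
    """Return 1 (success) with probability p, otherwise 0."""
    return 1 if rng.random() < p else 0

rng = random.Random(42)  # fixed seed so the run is repeatable
p = 0.152                # control conversion rate from the A/B example

samples = [bernoulli_trial(p, rng) for _ in range(100_000)]
sample_mean = sum(samples) / len(samples)

print(f"Sample mean: {sample_mean:.3f} (theory: {p})")
print(f"Theoretical variance p(1 - p): {p * (1 - p):.4f}")
```

Each visitor who either converts or does not can be modeled as one such trial.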
Binomial Distribution (multiple yes/no trials)
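The binomial distribution gives the probability of exactly k successes in n independent Bernoulli trials: P(X = k) = C(n, k) p^k (1 - p)^(n - k). Here is a small sketch of that formula; the function name `binomial_pmf` and the choice of n = 10 visitors are assumptions made for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.152  # 10 visitors, each converting with probability 15.2%

for k in range(4):
    print(f"P({k} conversions out of {n}) = {binomial_pmf(k, n, p):.4f}")

# The probabilities over all possible k must sum to 1
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))
print(f"Sum over all k: {total:.6f}")
print(f"Expected conversions n * p: {n * p:.2f}")
```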
Normal Distribution (bell curve)
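A normal distribution is fully described by its mean and standard deviation. As a sketch, the density and cumulative distribution can be written with the standard library's `math.erf`; the helper names `normal_pdf` and `normal_cdf` are illustrative. The loop checks the familiar rule of thumb that about 68%, 95%, and 99.7% of the mass lies within 1, 2, and 3 standard deviations of the mean.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for a normal distribution, via the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

for k in (1, 2, 3):
    mass = normal_cdf(k) - normal_cdf(-k)
    print(f"Within {k} standard deviation(s): {mass:.4f}")
```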
Real Application: Anomaly Detection
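One common way to apply the normal distribution to anomaly detection is z-score thresholding: flag any point that lies more than a chosen number of standard deviations from the mean. The sketch below assumes roughly normal data; the function name `find_anomalies`, the threshold of 3, and the latency values are all made up for illustration.

```python
import statistics

def find_anomalies(values, threshold=3.0):
    """Return (index, value) pairs whose z-score exceeds the threshold."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # all values identical, nothing can stand out
    return [
        (i, v) for i, v in enumerate(values)
        if abs(v - mean) / std > threshold
    ]

# Hypothetical response times in milliseconds, with one obvious spike
latencies_ms = [102, 98, 101, 99, 100, 103, 97, 100, 101, 99, 100, 250]
print(find_anomalies(latencies_ms))
```

Note that a large outlier inflates the mean and standard deviation it is measured against, so on small samples a robust alternative (for example, median-based scores) may be a better fit.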
Bayesian Thinking
Bayes' Theorem
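Bayes' theorem updates a prior belief P(H) after seeing evidence E: P(H | E) = P(E | H) P(H) / P(E), where P(E) = P(E | H) P(H) + P(E | not H) P(not H). A minimal sketch follows; the function name `posterior` and the diagnostic-test numbers (1% prevalence, 95% sensitivity, 5% false positive rate) are hypothetical values chosen for illustration.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(H | E) from Bayes' theorem.

    prior               = P(H)
    likelihood          = P(E | H)
    false_positive_rate = P(E | not H)
    """
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical test for a condition that affects 1% of people
p = posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(f"P(condition | positive test) = {p:.3f}")
```

Even with an accurate test, the posterior stays near 16% here because the condition is rare, which is the kind of result that raw intuition tends to get wrong.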
Real Example: Spam Classification
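As one possible illustration of Bayesian spam classification, here is a tiny naive Bayes sketch with add-one (Laplace) smoothing. The training messages are invented toy data, the helper names are hypothetical, and a real filter would need far more data plus proper tokenization.

```python
import math
from collections import Counter

# Toy training data, invented purely for illustration
spam_docs = ["win money now", "free money offer", "win free prize"]
ham_docs = ["meeting at noon", "project status update", "lunch at noon"]

def train(docs):
    """Count word occurrences across a list of messages."""
    counts = Counter(word for doc in docs for word in doc.split())
    return counts, sum(counts.values())

spam_counts, spam_total = train(spam_docs)
ham_counts, ham_total = train(ham_docs)
vocab = set(spam_counts) | set(ham_counts)

def log_likelihood(message, counts, total):
    """Sum of log P(word | class), with add-one smoothing."""
    return sum(
        math.log((counts[w] + 1) / (total + len(vocab)))
        for w in message.split()
    )

def spam_probability(message, prior_spam=0.5):
    """P(spam | message) under the naive independence assumption."""
    log_spam = math.log(prior_spam) + log_likelihood(message, spam_counts, spam_total)
    log_ham = math.log(1 - prior_spam) + log_likelihood(message, ham_counts, ham_total)
    # Subtract the max before exponentiating to avoid underflow
    m = max(log_spam, log_ham)
    spam_w, ham_w = math.exp(log_spam - m), math.exp(log_ham - m)
    return spam_w / (spam_w + ham_w)

for msg in ["free money", "lunch at noon"]:
    print(f"{msg!r}: P(spam) = {spam_probability(msg):.3f}")
```

Working in log space keeps the product of many small word probabilities from underflowing to zero on longer messages.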
Hypothesis Testing and A/B Tests
The Right Way to Do A/B Testing
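A useful first check on the opening example (152/1000 vs 165/1000) is whether the difference could plausibly be noise. The sketch below runs a pooled two-proportion z-test using only the standard library; the function name `two_proportion_z_test` is illustrative, and the two-sided test and conventional 0.05 threshold are assumptions, not something the original experiment specified.

```python
import math

def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

z, p_value = two_proportion_z_test(152, 1000, 165, 1000)
print(f"z = {z:.3f}, p-value = {p_value:.3f}")
print("Significant at 0.05" if p_value < 0.05 else "Not significant at 0.05")
```

With these counts the p-value comes out far above 0.05, so the apparent 8.6% lift is well within what chance alone could produce at this sample size.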
Sample Size Calculation
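A standard approximation for the per-group sample size needed to detect a difference between two proportions is n = (z_{alpha/2} + z_beta)^2 * [p1(1 - p1) + p2(1 - p2)] / (p1 - p2)^2. The sketch below applies it to the rates from the opening example; the function name, the 5% significance level, and the 80% power target are conventional defaults assumed here.

```python
import math
from statistics import NormalDist

def sample_size_per_group(p_baseline, p_target, alpha=0.05, power=0.80):
    """Approximate visitors needed per group for a two-sided test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for 80% power
    variance = p_baseline * (1 - p_baseline) + p_target * (1 - p_target)
    effect = p_target - p_baseline
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)

n = sample_size_per_group(0.152, 0.165)
print(f"Visitors needed per group: {n}")
```

Under these assumptions the answer is on the order of twelve thousand visitors per group, an order of magnitude more than the 1,000 per group in the opening example.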
Correlation vs Causation
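A simulation can show how a hidden common cause produces correlation without causation. In this sketch, every variable and coefficient is invented: temperature drives both ice cream sales and sunburn counts, and neither of those affects the other. The two are strongly correlated until the effect of temperature is removed.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical data: temperature is the confounder behind both series
temperature = [rng.uniform(15, 35) for _ in range(5000)]
ice_cream = [10 * t + rng.gauss(0, 20) for t in temperature]
sunburns = [2 * t + rng.gauss(0, 5) for t in temperature]

raw_r = pearson_r(ice_cream, sunburns)

# Remove the (known, simulated) temperature effect from each series
ice_resid = [s - 10 * t for s, t in zip(ice_cream, temperature)]
burn_resid = [b - 2 * t for b, t in zip(sunburns, temperature)]
adjusted_r = pearson_r(ice_resid, burn_resid)

print(f"Raw correlation: {raw_r:.2f}")
print(f"After removing temperature: {adjusted_r:.2f}")
```

In real data the confounder's effect is not known in advance, which is why randomized experiments such as A/B tests are the more reliable route to causal claims.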
Confidence Intervals
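A confidence interval reports a range of plausible values rather than a single point estimate. The sketch below computes a normal-approximation (Wald) interval for a conversion rate and applies it to both groups from the opening example; the function name is illustrative, and the Wald interval is one of several options (it is known to behave poorly for very small samples or rates near 0 or 1).

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence=0.95):
    """Wald (normal-approximation) interval for a proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

control_ci = proportion_confidence_interval(152, 1000)
test_ci = proportion_confidence_interval(165, 1000)

print(f"Control: {control_ci[0]:.3f} to {control_ci[1]:.3f}")
print(f"Test:    {test_ci[0]:.3f} to {test_ci[1]:.3f}")
```

The two intervals overlap substantially, which is consistent with the observed difference between the groups being too small to distinguish from noise at 1,000 visitors each.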
Key Takeaways
What's Next