Sampling Strategies

The Cost of Complete Observability

Three months into production with OpenTelemetry, I got a wake-up call: my observability bill was $12,000/month. We were processing 50 million requests per day, and I was tracing every single one.

The math was brutal:

  • 50M requests/day = ~580 requests/second

  • Average trace size: 15KB (across 7 microservices)

  • Daily data: 50M × 15KB = 750GB/day

  • Monthly storage: 22.5TB

At $0.50/GB storage and $0.10/GB ingestion: $13,500/month for traces alone.

The solution? Smart sampling. I kept 100% visibility into errors while sampling only 1% of successful requests. New cost: $800/month. Same debugging capability.

Understanding Sampling

Sampling means deciding which traces to keep and which to discard.

Head-Based Sampling

Decision made at trace creation (the "head" of the trace).

Pros:

  • Low overhead

  • Decision made early

  • Easy to implement

Cons:

  • Can't see the future (no way to know yet whether the trace will end in an error)

  • Might discard interesting traces

  • No post-filtering

Tail-Based Sampling

Decision made after trace completes (the "tail").

Pros:

  • Can see the entire trace before deciding

  • Can keep every error and every slow request

  • More intelligent decisions

Cons:

  • Higher overhead

  • Requires buffering complete traces in memory

  • Needs a centralized collector

Built-In Samplers

1. Always On (Don't Use in Production!)

Keeps every trace. Only use for development or very low-volume services.

2. Always Off (Also Don't Use!)

Discards every trace. Why even run OTel?

3. Ratio-Based Sampling (Production Standard)

How it works: The decision is derived deterministically from the trace ID, so the same trace ID always gets the same decision in every service.

Use when: You want simple, stateless sampling.
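
For reference, a minimal sketch of how these built-in samplers are wired up with the OpenTelemetry Python SDK; the 1% ratio is just an example value:

```python
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.sampling import (
    ALWAYS_OFF,         # 2. drop every trace
    ALWAYS_ON,          # 1. keep every trace (development only)
    TraceIdRatioBased,  # 3. deterministic ratio derived from the trace ID
)

# Keep roughly 1% of traces; the same trace ID always gets the same decision.
provider = TracerProvider(sampler=TraceIdRatioBased(0.01))
```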

4. Parent-Based Sampling (The Smart Default)

Critical for distributed tracing: it honors the parent span's sampling decision, so every service in a trace keeps or drops that trace consistently.

Example:
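
A typical Python setup wraps a ratio sampler in ParentBased: root spans are ratio-sampled, and child spans simply follow their parent's decision (the 1% value is illustrative):

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.sampling import ParentBased, TraceIdRatioBased

# Root spans: sample 1% of new traces.
# Child spans: follow whatever the parent (possibly in another service) decided.
sampler = ParentBased(root=TraceIdRatioBased(0.01))
trace.set_tracer_provider(TracerProvider(sampler=sampler))
```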

Custom Sampling: The Production Solution

Here's the sampler I actually use in production:
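
At its core it keeps every flagged error, gives a handful of critical endpoints a higher rate, and ratio-samples everything else. Below is a minimal sketch of that logic with the OpenTelemetry Python SDK; the class name, endpoint list, and rates are illustrative rather than the exact production values:

```python
from opentelemetry.sdk.trace.sampling import (
    Decision,
    Sampler,
    SamplingResult,
    TraceIdRatioBased,
)


class PrioritySampler(Sampler):
    """Keep flagged errors, more of the critical endpoints, 1% of the rest."""

    def __init__(self, base_rate=0.01, critical_rate=0.10,
                 critical_endpoints=("/checkout", "/payment")):
        self._base = TraceIdRatioBased(base_rate)
        self._critical = TraceIdRatioBased(critical_rate)
        self._critical_endpoints = critical_endpoints

    def should_sample(self, parent_context, trace_id, name, kind=None,
                      attributes=None, links=None, trace_state=None):
        attributes = attributes or {}

        # Errors are only catchable at the head if instrumentation flags them
        # up front (e.g. a retry of a previously failed request); full error
        # capture needs the tail-based setup described later in this chapter.
        if attributes.get("error"):
            return SamplingResult(Decision.RECORD_AND_SAMPLE, attributes)

        # Critical endpoints get a higher rate.
        route = str(attributes.get("http.route", ""))
        if any(route.startswith(ep) for ep in self._critical_endpoints):
            return self._critical.should_sample(
                parent_context, trace_id, name, kind, attributes, links, trace_state)

        # Everything else: the low baseline ratio.
        return self._base.should_sample(
            parent_context, trace_id, name, kind, attributes, links, trace_state)

    def get_description(self):
        return "PrioritySampler"
```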

Measuring Request Duration for Sampling

The problem: How do you know if a request is "slow" at the start of the trace?

Answer: You don't. But you can make an educated guess:
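
One workable sketch: keep a rolling average of recent durations per endpoint, and treat new requests to historically slow endpoints as "probably slow". The LatencyTracker name and thresholds below are illustrative:

```python
import time


class LatencyTracker:
    """Rolling average of observed request durations, keyed by endpoint."""

    def __init__(self, alpha=0.1):
        self._alpha = alpha     # weight of the newest observation
        self._avg_ms = {}       # endpoint -> exponential moving average (ms)
        self._last_seen = {}    # endpoint -> last update timestamp

    def record(self, endpoint, duration_ms):
        # Exponential moving average: cheap, no per-request history to keep.
        if endpoint in self._avg_ms:
            prev = self._avg_ms[endpoint]
            self._avg_ms[endpoint] = (1 - self._alpha) * prev + self._alpha * duration_ms
        else:
            self._avg_ms[endpoint] = duration_ms
        self._last_seen[endpoint] = time.time()

    def is_probably_slow(self, endpoint, threshold_ms=1000):
        return self._avg_ms.get(endpoint, 0) >= threshold_ms
```

The sampler consults is_probably_slow(route) at span creation and applies a higher rate when it returns True; record() is called when the span ends.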

Warning: This has memory implications. Clean up old entries:
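
A simple guard, continuing the LatencyTracker sketch above: periodically evict endpoints that haven't been seen recently so the dictionaries can't grow without bound.

```python
    # Additional method on the LatencyTracker sketch above:
    def cleanup(self, max_age_seconds=3600):
        """Drop endpoints that haven't been observed for max_age_seconds."""
        cutoff = time.time() - max_age_seconds
        stale = [ep for ep, seen in self._last_seen.items() if seen < cutoff]
        for ep in stale:
            self._avg_ms.pop(ep, None)
            self._last_seen.pop(ep, None)
```

Call it periodically, for example from a background thread or every few thousand calls to record().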

Tail-Based Sampling with Collector

For true tail-based sampling, use the OpenTelemetry Collector:

collector-config.yaml:
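
A representative configuration using the tail_sampling processor from the Collector contrib distribution; the policy names, thresholds, and backend endpoint are placeholders to adapt:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  tail_sampling:
    decision_wait: 10s          # how long to buffer spans before deciding
    num_traces: 100000          # max traces held in memory
    policies:
      - name: keep-all-errors
        type: status_code
        status_code:
          status_codes: [ERROR]
      - name: keep-slow-requests
        type: latency
        latency:
          threshold_ms: 1000
      - name: sample-the-rest
        type: probabilistic
        probabilistic:
          sampling_percentage: 1

exporters:
  otlp:
    endpoint: your-backend:4317   # placeholder backend

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [tail_sampling]
      exporters: [otlp]
```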

Run the collector:
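
For example, with Docker and the contrib image (which ships the tail_sampling processor); the image tag and in-container config path may vary by version:

```bash
docker run --rm \
  -p 4317:4317 \
  -v "$(pwd)/collector-config.yaml:/etc/otelcol-contrib/config.yaml" \
  otel/opentelemetry-collector-contrib:latest
```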

Update your application to send to collector:
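
A sketch with the Python OTLP exporter, assuming the collector from the previous step is reachable on localhost:4317. Tail-based sampling only works if the application exports every span, so head sampling stays wide open here:

```python
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.sdk.trace.sampling import ALWAYS_ON, ParentBased

# Send everything; the collector's tail_sampling processor makes the decision.
provider = TracerProvider(sampler=ParentBased(root=ALWAYS_ON))
provider.add_span_processor(
    BatchSpanProcessor(
        OTLPSpanExporter(endpoint="http://localhost:4317", insecure=True)
    )
)
trace.set_tracer_provider(provider)
```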

Sampling Metrics: What Am I Actually Keeping?

Track your sampling decisions:
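
One way to do this is a counter incremented inside the sampler; the metric name and labels below are illustrative:

```python
from opentelemetry import metrics

meter = metrics.get_meter("sampling")

# Hypothetical counter; a Prometheus exporter would typically expose it
# as sampling_decisions_total.
sampling_decisions = meter.create_counter(
    "sampling.decisions",
    description="Sampling decisions, labeled by outcome and reason",
)

# Inside the sampler, right before returning each decision:
sampling_decisions.add(1, {"decision": "keep", "reason": "error"})
sampling_decisions.add(1, {"decision": "drop", "reason": "baseline"})
```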

Query in Prometheus:
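
Assuming the counter above lands in Prometheus as sampling_decisions_total, the effective keep rate and its breakdown look something like:

```promql
# Overall share of requests kept over the last 5 minutes
sum(rate(sampling_decisions_total{decision="keep"}[5m]))
  /
sum(rate(sampling_decisions_total[5m]))

# Kept traces per second, broken down by reason
sum by (reason) (rate(sampling_decisions_total{decision="keep"}[5m]))
```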

Real Production Sampling Strategy

Here's what I actually run:
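
In rate terms, the strategy reduces to a small table; here is a sketch of how the custom sampler from earlier can encode it (the values match the results table that follows):

```python
from opentelemetry.sdk.trace.sampling import TraceIdRatioBased

# Category -> sampling rate; values match the results table below.
RATES = {
    "error": 1.00,              # keep every error
    "slow": 0.50,               # requests predicted slow (see LatencyTracker above)
    "critical_endpoint": 0.10,  # e.g. /checkout, /payment
    "default": 0.01,            # everything else
}

# Pre-built ratio samplers the custom sampler picks from per request.
SAMPLERS = {category: TraceIdRatioBased(rate) for category, rate in RATES.items()}
```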

Results at 50M requests/day:

| Category | Requests/day | Sample Rate | Traces Kept | Cost |
| --- | --- | --- | --- | --- |
| Errors (0.5%) | 250,000 | 100% | 250,000 | $187.50 |
| Slow (2%) | 1,000,000 | 50% | 500,000 | $375.00 |
| Critical endpoints (10%) | 5,000,000 | 10% | 500,000 | $375.00 |
| Normal traffic | 43,750,000 | 1% | 437,500 | $328.13 |
| Total | 50,000,000 | 3.375% | 1,687,500 | $1,265.63 |

Down from $13,500 to $1,266, a 90% cost reduction, while still catching every error!

Adaptive Sampling (Advanced)

Dynamically adjust sampling based on load:
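
One common approach, sketched here on the assumption that you can periodically read your current request rate from metrics: pick a budget of kept traces per second and recompute the ratio from recent throughput. Names and numbers are illustrative:

```python
import threading


class AdaptiveRatio:
    """Recompute the sampling ratio so kept traces stay near a fixed budget."""

    def __init__(self, target_traces_per_sec=50.0, min_rate=0.001, max_rate=1.0):
        self._target = target_traces_per_sec
        self._min = min_rate
        self._max = max_rate
        self._rate = max_rate
        self._lock = threading.Lock()

    @property
    def rate(self):
        with self._lock:
            return self._rate

    def update(self, observed_requests_per_sec):
        # Called periodically (e.g. every 30s) with the current request rate.
        if observed_requests_per_sec <= 0:
            return
        new_rate = self._target / observed_requests_per_sec
        with self._lock:
            self._rate = max(self._min, min(self._max, new_rate))
```

The custom sampler then rebuilds its baseline TraceIdRatioBased whenever rate changes, trading strict per-trace-ID determinism across time for a predictable trace volume.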

Best Practices

  1. Always sample errors - you can't debug what you don't see

  2. Use parent-based sampling for distributed traces

  3. Track sampling metrics to understand what you're keeping

  4. Start conservative (1%), increase if needed

  5. Monitor costs - set budget alerts

  6. Test sampling in staging before production

  7. Document sampling logic for your team

Common Pitfalls

❌ Sampling after the fact (exporting everything, then filtering in the backend)

✅ Sample at trace creation or in the collector, before you pay to store the data

❌ Different sampling rates per service (you end up with broken, partial traces)

✅ Parent-based sampling, so every service follows the same decision

What's Next

Continue to Resource Detection, where you'll learn:

  • Automatic service identification

  • Environment metadata

  • Deployment information

  • Custom resource attributes


Previous: ← Distributed Tracing | Next: Resource Detection →

Sample smart, not hard. Keep what matters.
