Part 5: Error Handling and Interceptors in gRPC

The 3 AM Production Incident

It was 3:17 AM when I got the PagerDuty alert. Our payment processing gRPC service was returning cryptic errors: "code: 2, message: unknown error". No stack traces. No context. No idea what was broken. The mobile app showed "Something went wrong" to thousands of users trying to complete purchases.

After two hours of digging through logs, I found the cause: a database connection timeout. The error handling was so poor that it swallowed the real error, which made debugging nearly impossible. That night cost us approximately $45,000 in lost revenue and emergency response.

This part covers everything I learned, the hard way, about proper error handling in gRPC.

gRPC Status Codes

Standard Status Codes

// src/utils/grpc-status.ts
import * as grpc from '@grpc/grpc-js';

export const GrpcStatus = {
  // Success
  OK: grpc.status.OK, // 0

  // Caller-side errors (a problem with the request or its preconditions)
  CANCELLED: grpc.status.CANCELLED, // 1 - Request cancelled
  INVALID_ARGUMENT: grpc.status.INVALID_ARGUMENT, // 3 - Invalid request
  NOT_FOUND: grpc.status.NOT_FOUND, // 5 - Resource not found
  ALREADY_EXISTS: grpc.status.ALREADY_EXISTS, // 6 - Resource exists
  PERMISSION_DENIED: grpc.status.PERMISSION_DENIED, // 7 - No permission
  RESOURCE_EXHAUSTED: grpc.status.RESOURCE_EXHAUSTED, // 8 - Rate limit/quota
  FAILED_PRECONDITION: grpc.status.FAILED_PRECONDITION, // 9 - State issue
  OUT_OF_RANGE: grpc.status.OUT_OF_RANGE, // 11 - Invalid range
  UNAUTHENTICATED: grpc.status.UNAUTHENTICATED, // 16 - Not authenticated

  // Server-side or transient errors
  UNKNOWN: grpc.status.UNKNOWN, // 2 - Unknown error
  DEADLINE_EXCEEDED: grpc.status.DEADLINE_EXCEEDED, // 4 - Timeout
  ABORTED: grpc.status.ABORTED, // 10 - Concurrency conflict
  UNIMPLEMENTED: grpc.status.UNIMPLEMENTED, // 12 - Not implemented
  INTERNAL: grpc.status.INTERNAL, // 13 - Server error
  UNAVAILABLE: grpc.status.UNAVAILABLE, // 14 - Service unavailable
  DATA_LOSS: grpc.status.DATA_LOSS, // 15 - Data loss/corruption
};

// Human-readable descriptions
export const StatusDescriptions: Record<number, string> = {
  [grpc.status.OK]: 'Success',
  [grpc.status.CANCELLED]: 'Request was cancelled',
  [grpc.status.INVALID_ARGUMENT]: 'Invalid request parameters',
  [grpc.status.DEADLINE_EXCEEDED]: 'Request timeout exceeded',
  [grpc.status.NOT_FOUND]: 'Resource not found',
  [grpc.status.ALREADY_EXISTS]: 'Resource already exists',
  [grpc.status.PERMISSION_DENIED]: 'Permission denied',
  [grpc.status.RESOURCE_EXHAUSTED]: 'Resource exhausted (rate limit or quota)',
  [grpc.status.FAILED_PRECONDITION]: 'Operation rejected due to system state',
  [grpc.status.ABORTED]: 'Operation aborted (typically due to concurrency issue)',
  [grpc.status.OUT_OF_RANGE]: 'Operation attempted past valid range',
  [grpc.status.UNIMPLEMENTED]: 'Operation not implemented',
  [grpc.status.INTERNAL]: 'Internal server error',
  [grpc.status.UNAVAILABLE]: 'Service unavailable',
  [grpc.status.DATA_LOSS]: 'Unrecoverable data loss or corruption',
  [grpc.status.UNAUTHENTICATED]: 'Authentication required',
  [grpc.status.UNKNOWN]: 'Unknown error',
};

Custom Error Classes
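
A minimal sketch of what such error classes could look like, not a definitive implementation. The names `AppError`, `NotFoundError`, `DependencyUnavailableError`, and `toAppError` are hypothetical, and the status values are copied into a local constant so the snippet runs on its own; a real service would presumably use `grpc.status` from `@grpc/grpc-js`. The idea is that every error carries a status code, a client-safe message, a request ID, and the original cause, so a database timeout never degrades into "code: 2, unknown error".

```typescript
// Local copy of the numeric gRPC status values used in this sketch
// (assumption: real code would reference grpc.status instead).
const Status = {
  UNKNOWN: 2,
  INVALID_ARGUMENT: 3,
  NOT_FOUND: 5,
  INTERNAL: 13,
  UNAVAILABLE: 14,
} as const;

type StatusCode = (typeof Status)[keyof typeof Status];

// Hypothetical base class: status code + client-safe message + request ID,
// with the original error kept for server-side logs only.
class AppError extends Error {
  constructor(
    readonly code: StatusCode,
    message: string,
    readonly requestId?: string,
    readonly originalError?: unknown,
  ) {
    super(message);
    this.name = new.target.name;
    // Keeps instanceof working when compiled to older JavaScript targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }

  // The only fields a client should see; originalError is deliberately omitted.
  toClientPayload(): { code: StatusCode; message: string; requestId?: string } {
    return { code: this.code, message: this.message, requestId: this.requestId };
  }
}

class NotFoundError extends AppError {
  constructor(resource: string, id: string, requestId?: string) {
    super(Status.NOT_FOUND, `${resource} '${id}' not found`, requestId);
  }
}

class DependencyUnavailableError extends AppError {
  constructor(dependency: string, originalError: unknown, requestId?: string) {
    super(Status.UNAVAILABLE, `${dependency} is temporarily unavailable`, requestId, originalError);
  }
}

// Normalizes anything thrown into an AppError. Unexpected errors become
// INTERNAL with a generic message; the original is preserved for logging.
function toAppError(err: unknown, requestId?: string): AppError {
  if (err instanceof AppError) return err;
  return new AppError(Status.INTERNAL, 'Internal server error', requestId, err);
}
```

In the 3 AM scenario, the data layer would throw something like `new DependencyUnavailableError('database', timeoutError, requestId)`: the client gets UNAVAILABLE with a request ID it can report, while the server log still has the real timeout attached as `originalError`.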

Error Handling Interceptor

Input Validation with Zod

Using Validation in Handlers

Advanced Interceptors

1. Logging Interceptor

2. Metrics Interceptor

3. Timeout Interceptor
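
The core of a timeout interceptor can be sketched as a wrapper around the handler's promise. This is an illustration under assumptions: `withTimeout` is a hypothetical helper, the status value 4 is the numeric `DEADLINE_EXCEEDED` code from the table above, and wiring it into an actual interceptor chain is left out.

```typescript
const DEADLINE_EXCEEDED = 4; // numeric value of grpc.status.DEADLINE_EXCEEDED

// Rejects with a DEADLINE_EXCEEDED-style error if `work` has not settled
// within `timeoutMs`. Note: this does not cancel the underlying work,
// it only stops waiting for it.
function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      const err = Object.assign(new Error(`Handler exceeded ${timeoutMs} ms`), {
        code: DEADLINE_EXCEEDED,
      });
      reject(err);
    }, timeoutMs);

    work.then(
      (value) => {
        clearTimeout(timer); // finished in time: drop the timer
        resolve(value);
      },
      (err) => {
        clearTimeout(timer); // failed in time: propagate the real error
        reject(err);
      },
    );
  });
}
```

Because the wrapper only stops waiting, a slow database query keeps running in the background; real cancellation would need cooperation from the handler, which this sketch does not attempt.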

4. Request Size Limit Interceptor

5. Correlation ID Interceptor

Retry Logic (Client-Side)
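
A minimal sketch of client-side retry with exponential backoff and jitter. Everything here is an assumption for illustration: `withRetry`, `RetryOptions`, and the default values are hypothetical, and only `UNAVAILABLE` (14) is treated as retryable by default, since retrying non-idempotent calls on other codes can be unsafe.

```typescript
const UNAVAILABLE = 14; // numeric value of grpc.status.UNAVAILABLE

interface RetryOptions {
  maxAttempts: number;      // total attempts, including the first call
  baseDelayMs: number;      // backoff ceiling for the first retry
  maxDelayMs: number;       // upper bound for any single delay
  retryableCodes: number[]; // status codes worth retrying
}

const defaultRetryOptions: RetryOptions = {
  maxAttempts: 3,
  baseDelayMs: 100,
  maxDelayMs: 2000,
  retryableCodes: [UNAVAILABLE],
};

// Ceiling for the delay after attempt n (1-based): base * 2^(n-1), capped.
function backoffCeilingMs(attempt: number, opts: RetryOptions): number {
  return Math.min(opts.maxDelayMs, opts.baseDelayMs * Math.pow(2, attempt - 1));
}

// An error is retryable if it carries a numeric `code` in the allowed set.
function isRetryable(err: unknown, opts: RetryOptions): boolean {
  const code = (err as { code?: unknown } | null)?.code;
  return typeof code === 'number' && opts.retryableCodes.indexOf(code) !== -1;
}

async function withRetry<T>(
  call: () => Promise<T>,
  options: Partial<RetryOptions> = {},
): Promise<T> {
  const opts: RetryOptions = { ...defaultRetryOptions, ...options };
  for (let attempt = 1; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      // Give up on the last attempt or on errors that retrying cannot fix.
      if (attempt >= opts.maxAttempts || !isRetryable(err, opts)) throw err;
      // Full jitter: a random delay between 0 and the backoff ceiling.
      const delayMs = Math.random() * backoffCeilingMs(attempt, opts);
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

Usage would look like `await withRetry(() => chargeCard(request))`, where `chargeCard` stands in for any promise-returning client call. For payment-style operations, retries are only safe if the call is idempotent.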

Circuit Breaker Pattern
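
A minimal sketch of the pattern, assuming a hypothetical `CircuitBreaker` class with a consecutive-failure threshold and a fixed cool-down. The thresholds, the injectable clock, and the choice to fail fast with an `UNAVAILABLE`-style error (code 14) are all assumptions for illustration.

```typescript
type BreakerState = 'closed' | 'open' | 'half-open';

class CircuitBreaker {
  private state: BreakerState = 'closed';
  private consecutiveFailures = 0;
  private openedAt = 0;

  constructor(
    private readonly failureThreshold: number = 5,
    private readonly resetTimeoutMs: number = 30000,
    // Injectable clock so the cool-down can be tested without waiting.
    private readonly now: () => number = () => Date.now(),
  ) {}

  getState(): BreakerState {
    // An open breaker becomes half-open once the cool-down has elapsed.
    if (this.state === 'open' && this.now() - this.openedAt >= this.resetTimeoutMs) {
      this.state = 'half-open';
    }
    return this.state;
  }

  onSuccess(): void {
    this.consecutiveFailures = 0;
    this.state = 'closed';
  }

  onFailure(): void {
    this.consecutiveFailures++;
    // A failed trial call in half-open state re-opens immediately.
    if (this.state === 'half-open' || this.consecutiveFailures >= this.failureThreshold) {
      this.state = 'open';
      this.openedAt = this.now();
    }
  }

  async exec<T>(call: () => Promise<T>): Promise<T> {
    if (this.getState() === 'open') {
      // Fail fast without touching the struggling dependency.
      throw Object.assign(new Error('Circuit breaker is open'), { code: 14 });
    }
    try {
      const result = await call();
      this.onSuccess();
      return result;
    } catch (err) {
      this.onFailure();
      throw err;
    }
  }
}
```

This sketch lets every call through while half-open; a stricter version would allow only a single trial request at a time. It also counts every failure equally, whereas a production breaker would likely ignore caller errors such as INVALID_ARGUMENT.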

Complete Interceptor Chain

Key Takeaways

  1. Proper Error Codes: Use appropriate gRPC status codes, not just INTERNAL

  2. Rich Error Details: Include request IDs and context for debugging

  3. Validation First: Validate all inputs before processing

  4. Interceptor Chain: Use interceptors for cross-cutting concerns

  5. Never Expose Internals: Generic errors to clients, detailed logs on server

  6. Retry + Circuit Breaker: Client-side resilience patterns

  7. Observability: Log correlation IDs, metrics, and traces

My 3 AM Lesson: Good error handling is invisible when things work, but saves your company when things break.


Next: Part 6: API Documentation and Versioning in gRPC
