Part 4: Querying AWS Services with CloudWatch Logs

Each AWS service logs differently. Over time, I've learned the quirks and patterns of each service's logs. This part provides production-ready queries for the AWS services I work with most.

AWS Lambda Logs

Lambda is perhaps the most common source of CloudWatch logs in modern AWS architectures.

Lambda Log Structure

Lambda creates one log stream per execution environment (not per invocation) and automatically writes START, END, and REPORT lines around each invocation's own output:

START RequestId: abc-123 Version: $LATEST
2024-01-15T10:30:45.123Z abc-123 INFO User logged in
END RequestId: abc-123
REPORT RequestId: abc-123 Duration: 125.45 ms Billed Duration: 126 ms Memory Size: 512 MB Max Memory Used: 95 MB
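To make the REPORT line concrete, here is a minimal Python sketch that pulls its fields apart with a regular expression. The function name `parse_report` and the dictionary keys are my own choices for illustration. The optional `Init Duration` group is an addition beyond the sample above: Lambda appends that field to the REPORT line only on cold starts.

```python
import re

# Matches the REPORT line Lambda writes at the end of each invocation.
# "Init Duration" appears only on cold starts, so that group is optional.
REPORT_RE = re.compile(
    r"REPORT RequestId: (?P<request_id>\S+)\s+"
    r"Duration: (?P<duration_ms>[\d.]+) ms\s+"
    r"Billed Duration: (?P<billed_ms>\d+) ms\s+"
    r"Memory Size: (?P<memory_mb>\d+) MB\s+"
    r"Max Memory Used: (?P<max_memory_mb>\d+) MB"
    r"(?:\s+Init Duration: (?P<init_ms>[\d.]+) ms)?"
)


def parse_report(line):
    """Return the REPORT fields as a dict, or None for any other line."""
    m = REPORT_RE.search(line)
    if m is None:
        return None
    init_ms = m["init_ms"]
    return {
        "request_id": m["request_id"],
        "duration_ms": float(m["duration_ms"]),
        "billed_ms": int(m["billed_ms"]),
        "memory_mb": int(m["memory_mb"]),
        "max_memory_mb": int(m["max_memory_mb"]),
        "init_ms": float(init_ms) if init_ms is not None else None,
        "cold_start": init_ms is not None,
    }
```

Inside Logs Insights you rarely need this, because the same values are exposed as discovered fields such as @duration, @billedDuration, @maxMemoryUsed, and @initDuration. The sketch is mainly useful when processing exported log lines outside CloudWatch.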

Finding Errors in Lambda

fields @timestamp, @requestId, @message
| filter @message like /(?i)error/
| sort @timestamp desc
| limit 50

Lambda Cold Start Analysis

Lambda Performance Metrics

Lambda Timeouts

Real Example: Lambda Error Tracking

API Gateway Logs

API Gateway logs require explicit enablement but provide invaluable request/response data.

Enable API Gateway Logging

First, create an IAM role that allows API Gateway to write to CloudWatch Logs and set its ARN in the API Gateway account settings. Then enable logging on your API stage.
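As a rough sketch of the second step, the snippet below builds the patch operations that enable access and execution logging on a REST API stage. It assumes a REST API (not an HTTP API), and the helper name `access_log_patch_ops`, the chosen `$context` fields, and the ARN in the usage note are illustrative. The boto3 call is left as a comment so the block runs without AWS credentials.

```python
import json

# Access log format as a single-line JSON template. The $context variables
# are substituted by API Gateway at request time; this selection is only
# an example, so pick the fields you actually need.
ACCESS_LOG_FORMAT = json.dumps({
    "requestId": "$context.requestId",
    "ip": "$context.identity.sourceIp",
    "requestTime": "$context.requestTime",
    "httpMethod": "$context.httpMethod",
    "resourcePath": "$context.resourcePath",
    "status": "$context.status",
    "responseLatency": "$context.responseLatency",
})


def access_log_patch_ops(log_group_arn, log_level="INFO"):
    """Build patch operations for a stage update (hypothetical helper)."""
    return [
        # Where access logs are delivered
        {"op": "replace", "path": "/accessLogSettings/destinationArn",
         "value": log_group_arn},
        # What each access log entry contains
        {"op": "replace", "path": "/accessLogSettings/format",
         "value": ACCESS_LOG_FORMAT},
        # Execution logging level for all resources and methods
        {"op": "replace", "path": "/*/*/logging/loglevel",
         "value": log_level},
    ]


# Applying the operations with boto3 would look roughly like this:
#   import boto3
#   boto3.client("apigateway").update_stage(
#       restApiId="<your-api-id>", stageName="<your-stage>",
#       patchOperations=access_log_patch_ops("<log-group-arn>"))
```

For example, `access_log_patch_ops("arn:aws:logs:us-east-1:123456789012:log-group:/aws/apigateway/my-api")` returns three operations, where the account ID and region are placeholders. Logging as JSON pays off later, since Logs Insights can then discover fields such as status and responseLatency without a parse step.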

API Gateway Log Format

Query API Gateway Access Logs

API Gateway Error Rate

API Gateway Performance by Endpoint

Real Example: API Gateway 4xx vs 5xx Analysis

ECS/Fargate Container Logs

ECS container logs capture application output from Docker containers.

ECS Log Stream Format

Query ECS Container Logs

ECS Task Failures

ECS Container Resource Issues

Real Example: ECS Service Health Check

RDS Database Logs

RDS provides multiple log types: error logs, slow query logs, and general logs.

RDS Error Log Queries

RDS Slow Query Analysis

Real Example: RDS Connection Issues

VPC Flow Logs

VPC Flow Logs track network traffic for security and troubleshooting.

VPC Flow Log Format

Format: version account-id interface-id srcaddr dstaddr srcport dstport protocol packets bytes start end action log-status
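To show how the default format maps onto named fields, here is a small Python sketch that splits a record on whitespace and pairs the values with the field names above. The function name `parse_flow_record` is hypothetical, and the sample values in the usage note are made up for illustration.

```python
# Field names of the default flow log format, in order
# (hyphens replaced with underscores for use as dict keys).
FLOW_FIELDS = (
    "version account_id interface_id srcaddr dstaddr srcport dstport "
    "protocol packets bytes start end action log_status"
).split()

# Fields that hold integers when present
NUMERIC_FIELDS = ("srcport", "dstport", "protocol", "packets", "bytes", "start", "end")


def parse_flow_record(line):
    """Split one space-delimited flow log record into a dict."""
    values = line.split()
    if len(values) != len(FLOW_FIELDS):
        raise ValueError(
            f"expected {len(FLOW_FIELDS)} fields, got {len(values)}"
        )
    record = dict(zip(FLOW_FIELDS, values))
    # Fields with no data are logged as "-" (for example in NODATA
    # records), so map those to None instead of failing on int().
    for key in NUMERIC_FIELDS:
        record[key] = None if record[key] == "-" else int(record[key])
    return record
```

Parsing `2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK` yields a record whose dstport is 22, protocol is 6 (TCP), and action is ACCEPT. The Logs Insights parse command does the same job at query time with a matching pattern.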

Query VPC Flow Logs

VPC Traffic Volume by IP

Real Example: Security Analysis - Rejected Connections

CloudTrail Logs

CloudTrail tracks AWS API calls for security, compliance, and troubleshooting.

CloudTrail Event Structure

Query CloudTrail for Errors

CloudTrail Unauthorized Attempts

Real Example: CloudTrail Security Monitoring

Application Load Balancer Logs

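ALB access logs provide detailed HTTP request/response data. ALB delivers them to S3, so they must be forwarded to CloudWatch Logs before they can be queried with Logs Insights.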

ALB Log Format

Query ALB Access Logs

ALB Performance Analysis

Real Example: ALB Error Analysis

ElastiCache Logs

ElastiCache for Redis can deliver slow logs and engine logs to CloudWatch Logs. Log delivery is not available for Memcached.

Query ElastiCache Slow Logs

ElastiCache Connection Issues

CodeBuild Logs

CodeBuild logs contain build output and errors.

Query CodeBuild Failures

CodeBuild Phase Duration

Step Functions Logs

Step Functions logs state machine executions.

Query Step Functions Failures

Cross-Service Correlation Patterns

Pattern: Trace Request Across Services

Pattern: API Gateway → Lambda → RDS

Query multiple log groups:

  1. API Gateway: /aws/apigateway/my-api

  2. Lambda: /aws/lambda/my-function

  3. RDS: /aws/rds/instance/my-db/slowquery
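One way to query the three log groups above together is to build a single Logs Insights query around a shared correlation ID. The sketch below is only that: `build_trace_query` and `start_query_args` are hypothetical helpers, and it assumes the same ID appears in the messages of every group. For RDS slow query logs, that only holds if your application tags its SQL with the ID, for instance in a comment.

```python
import re

# Log groups from the list above
LOG_GROUPS = [
    "/aws/apigateway/my-api",
    "/aws/lambda/my-function",
    "/aws/rds/instance/my-db/slowquery",
]


def build_trace_query(correlation_id, limit=200):
    """Build a Logs Insights query that follows one ID across log groups."""
    # Allow only characters that cannot break out of the quoted literal
    if not re.fullmatch(r"[A-Za-z0-9_-]+", correlation_id):
        raise ValueError(f"unexpected characters in ID: {correlation_id!r}")
    return (
        "fields @timestamp, @log, @message\n"
        f'| filter @message like "{correlation_id}"\n'
        "| sort @timestamp asc\n"
        f"| limit {limit}"
    )


def start_query_args(correlation_id, start_epoch, end_epoch):
    """Keyword arguments for boto3's logs start_query call."""
    return {
        "logGroupNames": LOG_GROUPS,
        "startTime": start_epoch,  # seconds since the Unix epoch
        "endTime": end_epoch,
        "queryString": build_trace_query(correlation_id),
    }


# Running it would look roughly like this:
#   import boto3
#   logs = boto3.client("logs")
#   query_id = logs.start_query(**start_query_args("abc-123", t0, t1))["queryId"]
#   results = logs.get_query_results(queryId=query_id)
```

Sorting ascending and including @log (which identifies the source log group) lets the result read as a timeline of the request moving from API Gateway through Lambda to the database.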

Service-Specific Best Practices

Lambda Best Practices

  1. Always include request ID in logs

  2. Use structured logging (JSON)

  3. Monitor cold starts separately

  4. Track memory usage trends
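The first two practices can be combined in a few lines. This is a minimal sketch, not a full logging setup: `log_event` is a hypothetical helper, and the field names (`level`, `message`, `requestId`) are my own choice. In a Python Lambda handler the request ID is available as `context.aws_request_id`.

```python
import json


def log_event(level, message, request_id, **fields):
    """Print one JSON log line that always carries the request ID."""
    entry = {
        "level": level,
        "message": message,
        "requestId": request_id,
        **fields,  # any extra context, e.g. userId or orderId
    }
    line = json.dumps(entry, sort_keys=True)
    print(line)  # Lambda forwards stdout to CloudWatch Logs
    return line


# Inside a handler this might be used as:
#   def handler(event, context):
#       log_event("INFO", "User logged in", context.aws_request_id, userId="u-1")
```

Because each line is a JSON object, Logs Insights can typically discover the keys as fields, so a filter such as level = "ERROR" works without any parse step.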

API Gateway Best Practices

  1. Enable execution logging (not just access logging)

  2. Include correlation IDs

  3. Monitor 4xx vs 5xx separately

  4. Track latency by endpoint

RDS Best Practices

  1. Enable slow query logs (set threshold appropriately)

  2. Monitor connection pool exhaustion

  3. Track long-running queries

  4. Alert on replication lag

ECS Best Practices

  1. Use structured logging in containers

  2. Include container ID in logs

  3. Monitor task recycling frequency

  4. Track resource utilization

Key Takeaways

  • Each AWS service has unique log formats and structures

  • Lambda logs include automatic START/END/REPORT entries

  • API Gateway requires explicit log enablement

  • VPC Flow Logs use a space-delimited format

  • CloudTrail records AWS API calls for security auditing (data events must be enabled separately)

  • ALB logs provide detailed HTTP metrics

  • Use parsing to extract service-specific fields

  • Correlate across services using request/trace IDs

  • Enable appropriate log levels for each service

In Part 5, we'll build comprehensive observability dashboards using CloudWatch, combining these queries into actionable visualizations for real-time monitoring.
