# Serverless Architecture

## The First Time I Deployed a Function and Forgot About It

The first serverless function I deployed was a simple webhook handler. An IoT monitoring project I was working on needed to receive sensor data from a third-party platform, transform the payload, and store it in a database. I wrote a 40-line Python function, deployed it to AWS Lambda, and pointed the webhook at it.

It ran for eight months without needing my attention: no patches, no container restarts, no capacity planning. It handled 20 requests per day on average, with occasional spikes to 2,000 when a sensor batch published. The bill was essentially zero.

That experience taught me what serverless is genuinely good at: **event-triggered workloads with irregular traffic that do not need persistent state or long-running processes**.

## Table of Contents

* [What Is Serverless?](#what-is-serverless)
* [Core Characteristics](#core-characteristics)
* [Function-as-a-Service (FaaS)](#function-as-a-service-faas)
* [Practical Example: IoT Data Ingestion](#practical-example-iot-data-ingestion)
* [Event Sources](#event-sources)
* [Cold Start Problem](#cold-start-problem)
* [Backend-as-a-Service (BaaS)](#backend-as-a-service-baas)
* [When Serverless Makes Sense](#when-serverless-makes-sense)
* [When to Avoid Serverless](#when-to-avoid-serverless)
* [Lessons Learned](#lessons-learned)

***

## What Is Serverless?

Serverless does not mean no servers — it means **you do not manage the servers**. The cloud provider allocates compute on demand, executes your function, and deallocates the resources when the function finishes.

{% @mermaid/diagram content="graph LR
subgraph "Traditional Deployment"
SERVER\[Always-on Server<br/>Billed 24/7]
APP\[Application Process<br/>Always running]
SERVER --> APP
end

subgraph "Serverless"
EVENT\[Event / Trigger]
PLATFORM\[Platform provisions<br/>compute on demand]
FN\[Function executes]
TEARDOWN\[Resources released]

EVENT --> PLATFORM
PLATFORM --> FN
FN --> TEARDOWN
end" %}

You pay only for the time your code runs, not for idle capacity.

***

## Core Characteristics

* **Event-driven:** Functions are triggered by events — HTTP requests, queue messages, file uploads, scheduled timers, database changes
* **Stateless:** Each function invocation is independent. No in-memory state persists between invocations
* **Ephemeral:** The execution environment may be reused (warm start) or created fresh (cold start) — you cannot rely on either
* **Auto-scaling:** The platform handles scaling from 0 to thousands of concurrent executions without configuration
* **Billed per execution:** You pay for invocation count and execution duration, not for reserved capacity
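
A quick way to see the ephemeral/stateless points in practice is a module-level counter. Module scope runs once per execution environment, while the handler runs on every invocation, so the counter only climbs while the environment stays warm. This is a provider-agnostic sketch, not production code:

```python
# Module scope runs once per execution environment (i.e. on a cold start);
# the handler runs on every invocation. A module-level counter therefore
# reveals whether the current environment was reused.
invocation_count = 0  # resets to 0 whenever a fresh environment is created

def handler(event, context):
    global invocation_count
    invocation_count += 1
    # 1  -> cold start: this environment was just provisioned
    # >1 -> warm start: the platform reused an existing environment
    return {"invocations_in_this_environment": invocation_count}
```

This is also why in-memory state is useful only as an opportunistic cache, never as a source of truth.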

***

## Function-as-a-Service (FaaS)

The primary serverless model is FaaS, where you deploy individual functions:

| Provider     | Service                               |
| ------------ | ------------------------------------- |
| AWS          | Lambda                                |
| Azure        | Functions                             |
| Google Cloud | Cloud Functions / Cloud Run Functions |
| Cloudflare   | Workers                               |

I have used AWS Lambda and Azure Functions in personal projects. The programming model is the same: write a handler function that receives an event and returns a response.

```python
# AWS Lambda handler — IoT sensor data ingestion
import json
import psycopg2
from datetime import datetime
from os import environ

def lambda_handler(event, context):
    """
    Triggered by API Gateway or IoT webhook.
    Parses sensor payload and writes to PostgreSQL RDS.
    """
    try:
        body = json.loads(event.get("body", "{}"))
        device_id = body["device_id"]
        readings = body["readings"]
        timestamp = datetime.utcnow()

        conn = psycopg2.connect(environ["DB_URL"])
        cur = conn.cursor()

        cur.executemany(
            """
            INSERT INTO sensor_readings (device_id, metric, value, recorded_at)
            VALUES (%s, %s, %s, %s)
            """,
            [
                (device_id, r["metric"], r["value"], timestamp)
                for r in readings
            ]
        )

        conn.commit()
        cur.close()
        conn.close()

        return {
            "statusCode": 200,
            "body": json.dumps({"stored": len(readings)})
        }

    except KeyError as e:
        return {
            "statusCode": 400,
            "body": json.dumps({"error": f"Missing field: {e}"})
        }
```

This function has no persistent state, no long-running process, and scales automatically with traffic.
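
For reference, a minimal sketch of the API Gateway proxy event shape this handler reads. Only `body` matters to the function; a real event carries many more fields (headers, `requestContext`, and so on), and the device ID below is made up:

```python
import json

# Hypothetical API Gateway proxy event for the handler above.
# API Gateway delivers the request body as a JSON string under "body".
sample_event = {
    "body": json.dumps({
        "device_id": "sensor-42",  # illustrative device ID
        "readings": [
            {"metric": "temperature", "value": 21.5},
            {"metric": "humidity", "value": 48.0},
        ],
    })
}

parsed = json.loads(sample_event["body"])
```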

***

## Practical Example: IoT Data Ingestion

Here is the architecture I used for the IoT monitoring project:

{% @mermaid/diagram content="graph TB
SENSORS\[IoT Sensors]
GATEWAY\[IoT Platform Webhook]

subgraph AWS
APIGW\[API Gateway]
INGEST\[Lambda: Ingest]
TRANSFORM\[Lambda: Transform]
ALERT\[Lambda: Alert]
SQS\[SQS Queue]
RDS\[(PostgreSQL RDS)]
SNS\[SNS Topic]
end

NOTIFICATION\[Slack / Email]

SENSORS --> GATEWAY
GATEWAY --> APIGW
APIGW --> INGEST
INGEST --> SQS
SQS --> TRANSFORM
TRANSFORM --> RDS
TRANSFORM --> |anomaly detected| SNS
SNS --> ALERT
ALERT --> NOTIFICATION" %}

Each Lambda has a single responsibility:

* **Ingest:** Validate and queue the raw payload
* **Transform:** Normalise the sensor data format and persist
* **Alert:** Check thresholds and send notifications

Each can be deployed, scaled, and debugged independently.
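
As an illustration of the Ingest step, here is a hedged sketch of its validation logic. The field names mirror the Lambda example earlier; the real function would then queue valid payloads to SQS (via boto3's `send_message`), which is omitted here to keep the sketch self-contained:

```python
def validate_payload(body: dict) -> list[str]:
    """Return a list of validation errors; an empty list means queueable."""
    errors = []
    if "device_id" not in body:
        errors.append("missing device_id")
    readings = body.get("readings")
    if not isinstance(readings, list) or not readings:
        errors.append("readings must be a non-empty list")
    else:
        for i, r in enumerate(readings):
            if "metric" not in r or "value" not in r:
                errors.append(f"reading {i} missing metric/value")
    return errors
```

Keeping validation at the edge means the Transform Lambda can assume well-formed messages off the queue.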

***

## Event Sources

Functions can be triggered by many event types:

{% @mermaid/diagram content="graph LR
HTTP\[HTTP Request<br/>API Gateway]
QUEUE\[Queue Message<br/>SQS / Service Bus]
TIMER\[Scheduled Timer<br/>CloudWatch Events / Cron]
STORAGE\[File Upload<br/>S3 / Blob Storage]
DB\[Database Change<br/>DynamoDB Streams / Cosmos DB]
STREAM\[Event Stream<br/>Kinesis / Event Hubs]

FN\[Function]

HTTP --> FN
QUEUE --> FN
TIMER --> FN
STORAGE --> FN
DB --> FN
STREAM --> FN" %}

I use timer-triggered functions frequently for scheduled jobs — database cleanup, report generation, health checks — where previously I would have set up a cron job on a server.

```python
# Azure Function — scheduled database cleanup
# Runs every night at 2 AM UTC
import azure.functions as func
import psycopg2
from os import environ
from datetime import datetime, timedelta

app = func.FunctionApp()

@app.timer_trigger(schedule="0 0 2 * * *", arg_name="timer")  # NCRONTAB, six fields: sec min hour day month day-of-week
def nightly_cleanup(timer: func.TimerRequest) -> None:
    cutoff = datetime.utcnow() - timedelta(days=90)

    conn = psycopg2.connect(environ["DB_URL"])
    cur = conn.cursor()
    cur.execute(
        "DELETE FROM sensor_readings WHERE recorded_at < %s",
        (cutoff,)
    )
    deleted = cur.rowcount
    conn.commit()
    cur.close()
    conn.close()

    print(f"Cleaned up {deleted} records older than {cutoff.date()}")
```

No always-on server needed. This runs for a few seconds once per day and costs a fraction of a cent.
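
A rough back-of-envelope makes the point. The prices below are approximate Lambda list prices (they vary by region and change over time, so treat the numbers as illustrative only):

```python
# Approximate cost of the 20-requests-per-day webhook from the intro.
requests_per_month = 30 * 20           # ~20 requests/day
duration_s = 0.2                       # assumed 200 ms average execution
memory_gb = 0.128                      # 128 MB function

price_per_request = 0.20 / 1_000_000   # approx. $0.20 per million requests
price_per_gb_s = 0.0000166667          # approx. price per GB-second

gb_seconds = requests_per_month * duration_s * memory_gb
cost = requests_per_month * price_per_request + gb_seconds * price_per_gb_s
# a small fraction of a cent per month, before the free tier is applied
```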

***

## Cold Start Problem

When a function has not been invoked recently, the platform needs to provision a new execution environment — unpacking the runtime, loading dependencies, and initialising your code. This adds latency (typically 100ms–2s depending on the runtime and package size).

For my IoT webhook, cold starts were not a problem — sensor data ingestion can tolerate a few hundred milliseconds of extra latency.

For a user-facing API endpoint, a cold start on the first request within a session is noticeable.

Mitigations I use:

* **Keep dependencies small.** Large packages (pandas, torch) dramatically increase cold start time
* **Use provisioned concurrency** for latency-sensitive functions (AWS Lambda) — keep warm instances ready
* **Prefer compiled languages or slim runtimes** for latency-sensitive functions (Go, Rust, Node.js start faster than Python with heavy dependencies)
* **Keep initialisation code outside the handler** — database connection setup at module level is reused across warm invocations
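
The last point is worth a sketch. Anything initialised at module level survives across warm invocations, so expensive setup runs only on cold starts. Here `_connect` is a stand-in for something like `psycopg2.connect`, used so the sketch stays self-contained:

```python
import os

_conn = None  # created once per execution environment, reused while warm

def _connect(url: str):
    # stand-in for expensive setup, e.g. psycopg2.connect(url)
    return {"url": url}

def get_connection():
    global _conn
    if _conn is None:  # true only on a cold start
        _conn = _connect(os.environ.get("DB_URL", "postgresql://localhost/demo"))
    return _conn

def handler(event, context):
    conn = get_connection()  # warm invocations reuse the cached connection
    return {"db": conn["url"]}
```

The lazy `get_connection` variant also keeps import time (and therefore cold start latency) low compared to connecting unconditionally at module load.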

***

## Backend-as-a-Service (BaaS)

Serverless also encompasses managed backend services that replace infrastructure you would otherwise run yourself:

| BaaS                | Replaces                           |
| ------------------- | ---------------------------------- |
| Auth0 / Cognito     | Authentication service             |
| Firebase / Supabase | Database + realtime + auth         |
| Cloudinary          | Image/video processing and storage |
| SendGrid / SES      | Email delivery                     |
| Twilio              | SMS and voice                      |

In my IoT project, I used SNS for notifications instead of building and hosting a notification service. That is the BaaS mindset: use a managed service for anything that is not core to your domain.

***

## When Serverless Makes Sense

* **Event-driven workloads:** Webhooks, queue processors, file processing triggers
* **Scheduled jobs:** Nightly reports, cleanup tasks, health checks
* **Highly variable traffic:** Spikes from 10 to 10,000 requests without capacity planning
* **Side-channel processing:** Sending emails, processing images, generating PDFs after a main request
* **Low-cost experiments:** Running a new feature or integration without provisioning infrastructure
* **Simple API surfaces:** A small set of endpoints with straightforward logic

***

## When to Avoid Serverless

* **Long-running processes:** Functions have execution time limits (15 minutes on Lambda). Batch jobs, ML training, and long ETLs are not good fits
* **High-frequency, latency-sensitive APIs:** Cold starts and function invocation overhead add up at high QPS
* **Stateful workflows:** Managing state across multiple function invocations with external storage is operationally complex — use a proper workflow engine or long-running service
* **Local development bottleneck:** Testing locally with Lambda emulators (SAM, LocalStack) adds friction compared to running a plain web server
* **Cost at high volume:** Serverless can be more expensive than a reserved instance when traffic is consistently high and predictable

***

## Lessons Learned

* **Serverless excels at the edges of a system, not the core.** My main API is a long-running FastAPI service; the event processing and scheduled jobs around it are Lambda functions.
* **Treat each function as a microservice with its own deployment.** Monolithic Lambda functions that do too much are hard to debug and update.
* **Manage cold starts deliberately.** Do not ignore them and then be surprised when a user complains about intermittent latency.
* **External state is not optional.** Stateless functions need external stores — databases, caches, queues. Design for that dependency.
* **The operational overhead of serverless is low, not zero.** You still need logging, alerting, error tracking, and cost monitoring.
