# Peer-to-Peer Architecture

## What Sparked My Interest in P2P

I got interested in peer-to-peer architecture in two ways. The first was practical: using BitTorrent to download large Linux ISOs back when home broadband was slow and a direct download from a single server took hours. The second was understanding how Bitcoin worked: a distributed ledger with no central operator, maintained by consensus among thousands of nodes around the world.

Neither of those is something I built myself for a production system. But understanding P2P architecture changed how I thought about centralisation as a default assumption. It made me ask: *what does this system gain from having a central authority, and what does it lose?*

This article is more conceptual than the others in this series. P2P is not something I reach for regularly in application development, but it is foundational for understanding blockchains, distributed file systems, and decentralised protocols that are increasingly relevant.

## Table of Contents

* [What Is Peer-to-Peer Architecture?](#what-is-peer-to-peer-architecture)
* [P2P vs Client-Server](#p2p-vs-client-server)
* [Types of P2P Networks](#types-of-p2p-networks)
* [Key Concepts](#key-concepts)
* [Real-World P2P Systems I Have Studied](#real-world-p2p-systems-i-have-studied)
* [A Minimal P2P Node in Python](#a-minimal-p2p-node-in-python)
* [When P2P Makes Sense](#when-p2p-makes-sense)
* [Challenges](#challenges)
* [Lessons Learned](#lessons-learned)

***

## What Is Peer-to-Peer Architecture?

In a peer-to-peer network, **every participant (peer) acts as both client and server**. There is no central authority coordinating communication. Peers discover each other, communicate directly, and share resources (files, compute, bandwidth) without any single node being essential to the network's operation.

{% @mermaid/diagram content="graph TB
subgraph "Client-Server"
    SERVER[Central Server]
    C1[Client 1]
    C2[Client 2]
    C3[Client 3]
    C4[Client 4]

    C1 --> SERVER
    C2 --> SERVER
    C3 --> SERVER
    C4 --> SERVER
end

subgraph "Peer-to-Peer"
    P1[Peer 1]
    P2[Peer 2]
    P3[Peer 3]
    P4[Peer 4]
    P5[Peer 5]

    P1 <--> P2
    P1 <--> P3
    P2 <--> P4
    P3 <--> P4
    P4 <--> P5
    P2 <--> P5
end" %}

In the client-server model, the server is a single point of failure and a bottleneck. In P2P, removing any single peer does not bring down the network.

***

## P2P vs Client-Server

| Aspect                  | Client-Server                   | Peer-to-Peer                 |
| ----------------------- | ------------------------------- | ---------------------------- |
| Control                 | Centralised                     | Decentralised                |
| Single point of failure | Yes (server)                    | No                           |
| Scalability             | Server must scale               | Network scales with peers    |
| Latency                 | Depends on server               | Can route to nearest peer    |
| Trust model             | Trust the server                | Trust the protocol           |
| Complexity              | Simpler to build                | More complex                 |
| Data consistency        | Easy (one authoritative source) | Hard (distributed consensus) |

***

## Types of P2P Networks

### 1. Pure P2P

Every node is equal — there is no designated "server" node. Discovery, routing, and data transfer happen purely peer-to-peer. The Gnutella protocol used this model.

**Problem:** Search relies on flooding queries to neighbouring peers, which scales poorly, and bootstrapping is difficult (how do you find anyone when you first join?).

### 2. Hybrid P2P

A central coordinator helps with peer discovery, but actual data transfer happens directly peer-to-peer. BitTorrent uses this model — a **tracker server** tells you which peers have the file, but you download from peers directly, not from the tracker.

{% @mermaid/diagram content="sequenceDiagram
participant Client as New Peer
participant Tracker as Tracker (index server)
participant Peer1 as Existing Peer A
participant Peer2 as Existing Peer B

Client->>Tracker: I want file X, who has it?
Tracker-->>Client: Peers A and B are sharing file X
Peer1-->>Client: I have pieces 1-50
Peer2-->>Client: I have pieces 30-100
Client->>Peer1: Send me pieces 1-50
Client->>Peer2: Send me pieces 51-100
Peer1-->>Client: Pieces 1-50
Peer2-->>Client: Pieces 51-100

Note over Client: Now has full file.<br/>Also starts seeding to others." %}
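The tracker's role in the hybrid model can be sketched as a small in-memory index. This is a toy illustration rather than the BitTorrent tracker protocol: `ToyTracker` and `announce` are hypothetical names, and a real tracker speaks HTTP or UDP and returns a list of peers, with piece availability learned from the peers themselves.

```python
from collections import defaultdict

Peer = tuple[str, int]  # (host, port); needs Python 3.9+


class ToyTracker:
    """Hypothetical in-memory tracker: maps a file ID to the peers sharing it."""

    def __init__(self) -> None:
        self._swarms: dict[str, set[Peer]] = defaultdict(set)

    def announce(self, file_id: str, peer: Peer) -> list[Peer]:
        """Register `peer` for `file_id` and return the other known peers."""
        others = sorted(self._swarms[file_id] - {peer})
        self._swarms[file_id].add(peer)
        return others


tracker = ToyTracker()
first = tracker.announce("file-x", ("10.0.0.1", 6881))   # nobody else yet
second = tracker.announce("file-x", ("10.0.0.2", 6881))  # learns about the first peer
third = tracker.announce("file-x", ("10.0.0.3", 6881))   # learns about both
```

The tracker never touches file data. It only introduces peers to each other, which is why it can stay small even when the files being shared are large.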

### 3. Structured P2P (DHT)

**Distributed Hash Tables (DHTs)** create a structured overlay network where each node is responsible for a subset of the keyspace. Chord, Kademlia, and Pastry are well-known DHT algorithms. The BitTorrent DHT uses Kademlia.

In a DHT:

* Every node has an ID in a key space (e.g., a 160-bit integer)
* Each node is responsible for keys "close" to its ID (Kademlia, for example, measures closeness as the XOR of the two IDs)
* To find a value, you route through nodes with IDs progressively closer to the target key
* Lookup takes O(log N) hops, where N is the number of nodes
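The XOR metric from the list above can be illustrated in a few lines. This sketch assumes Kademlia-style integer IDs derived with SHA-1 (160 bits). The names `node_id`, `xor_distance`, and `closest_nodes` are illustrative, and a real DHT keeps routing buckets instead of a flat list of every node.

```python
import hashlib


def node_id(name: str) -> int:
    """Derive a 160-bit ID from an arbitrary string using SHA-1."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia-style distance: the bitwise XOR of two IDs, read as an integer."""
    return a ^ b


def closest_nodes(target: int, nodes: list[int], k: int = 3) -> list[int]:
    """Return the k node IDs with the smallest XOR distance to the target key."""
    return sorted(nodes, key=lambda n: xor_distance(n, target))[:k]


nodes = [node_id(f"node-{i}") for i in range(20)]
key = node_id("some-file.iso")
responsible = closest_nodes(key, nodes)  # the nodes that would store this key
```

Because keys and node IDs live in the same space, "who stores this key?" becomes a distance calculation that any node can perform locally.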

***

## Key Concepts

### Peer Discovery

How does a new node find other nodes? Common mechanisms:

* **Bootstrap nodes** — hardcoded well-known nodes that a new peer contacts first
* **Local network broadcast** — announce presence over mDNS or UDP broadcast
* **Distributed Hash Table** — ask known nodes for peers closer to your ID

### Data Distribution

In a file-sharing P2P network, files are split into **chunks**. Each peer can download chunks from multiple sources simultaneously and share chunks it already has, even before it has the complete file. This is why BitTorrent downloads are faster with more seeders — more sources mean more parallel transfers.
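The chunking idea can be sketched as follows. This is a simplified illustration with a deliberately tiny chunk size, and the function names are made up for the example. It verifies each chunk against a SHA-1 hash, similar in spirit to the per-piece hashes in a BitTorrent v1 torrent file, so that data from untrusted peers can be checked before it is accepted.

```python
import hashlib

CHUNK_SIZE = 4  # tiny for illustration; real chunks are far larger


def split_into_chunks(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Cut the data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def chunk_hashes(chunks: list[bytes]) -> list[str]:
    """Hash every chunk so receivers can verify each one independently."""
    return [hashlib.sha1(c).hexdigest() for c in chunks]


def reassemble(received: dict[int, bytes], expected: list[str]) -> bytes:
    """Verify every chunk against its expected hash, then join in index order."""
    for index, digest in enumerate(expected):
        if index not in received:
            raise ValueError(f"missing chunk {index}")
        if hashlib.sha1(received[index]).hexdigest() != digest:
            raise ValueError(f"chunk {index} failed verification")
    return b"".join(received[i] for i in range(len(expected)))


original = b"hello peer-to-peer world"
chunks = split_into_chunks(original)
hashes = chunk_hashes(chunks)

# Simulate chunks arriving out of order from two different peers.
from_peer_a = {i: chunks[i] for i in range(0, len(chunks), 2)}
from_peer_b = {i: chunks[i] for i in range(1, len(chunks), 2)}
received = {**from_peer_b, **from_peer_a}
rebuilt = reassemble(received, hashes)
```

Since each chunk is verified on its own, it does not matter which peer supplied it or in what order it arrived.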

### Consensus

In blockchain-style P2P networks, nodes must agree on a shared state (the ledger) without a central authority. Consensus algorithms like **Proof of Work**, **Proof of Stake**, and **PBFT** govern how agreement is reached and how conflicting versions are resolved.
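As a rough illustration of Proof of Work, the sketch below searches for a nonce that makes a SHA-256 hash start with a given number of zero hex digits. It is a toy model: the function names are invented for the example, and real networks such as Bitcoin compare a double SHA-256 hash of a block header against a numeric target instead of counting leading zero characters.

```python
import hashlib


def block_hash(data: str, nonce: int) -> str:
    """Hash the block data together with a candidate nonce."""
    return hashlib.sha256(f"{data}{nonce}".encode()).hexdigest()


def mine(data: str, difficulty: int) -> int:
    """Search for the first nonce whose hash starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while not block_hash(data, nonce).startswith(prefix):
        nonce += 1
    return nonce


def is_valid(data: str, nonce: int, difficulty: int) -> bool:
    """Checking a proof takes a single hash, however long the search took."""
    return block_hash(data, nonce).startswith("0" * difficulty)


data = "block with some transactions"
nonce = mine(data, difficulty=3)
```

The asymmetry is the point: finding the nonce takes many attempts, while any peer can verify it with one call to `is_valid`.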

***

## Real-World P2P Systems I Have Studied

| System                 | Type                          | What I Learned From It                                      |
| ---------------------- | ----------------------------- | ----------------------------------------------------------- |
| **BitTorrent**         | Hybrid P2P, file sharing      | Chunking + parallel downloads, tracker vs DHT               |
| **Bitcoin / Ethereum** | Pure P2P, blockchain          | Distributed consensus, Merkle trees, UTXO model (Bitcoin)   |
| **IPFS**               | Structured P2P (Kademlia DHT) | Content-addressed storage, decentralised hosting            |
| **WebRTC**             | Browser P2P                   | How browsers establish direct connections via ICE/STUN/TURN |

WebRTC is the one I have actually used in an application — building a simple real-time video call feature without a media server by establishing direct peer connections between browsers.

***

## A Minimal P2P Node in Python

Here is a simplified TCP-based P2P node to illustrate the concept — not production-ready, but useful for understanding the model:

```python
# peer_node.py: minimal P2P node (needs Python 3.9+ for set[...] annotations)
import socket
import threading
import json

class PeerNode:
    def __init__(self, host: str, port: int):
        self.host = host
        self.port = port
        self.peers: set[tuple[str, int]] = set()
        self._server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self._server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)

    def start(self):
        self._server.bind((self.host, self.port))
        self._server.listen(10)
        print(f"Node listening on {self.host}:{self.port}")
        threading.Thread(target=self._accept_connections, daemon=True).start()

    def _accept_connections(self):
        while True:
            conn, addr = self._server.accept()
            threading.Thread(
                target=self._handle_connection,
                args=(conn, addr),
                daemon=True
            ).start()

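    def _handle_connection(self, conn: socket.socket, addr: tuple):
        try:
            # The sender closes the socket after one message, so read until EOF
            # instead of assuming the whole message fits in a single recv().
            chunks = []
            while chunk := conn.recv(4096):
                chunks.append(chunk)
            message = json.loads(b"".join(chunks).decode())
            self._handle_message(message, addr)
        except (json.JSONDecodeError, UnicodeDecodeError):
            pass  # Ignore malformed messages from misbehaving peers
        finally:
            conn.close()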

    def _handle_message(self, message: dict, sender_addr: tuple):
        msg_type = message.get("type")

        if msg_type == "HELLO":
            # New peer announcing itself — register it
            peer_host = message["host"]
            peer_port = message["port"]
            self.peers.add((peer_host, peer_port))
            print(f"Registered peer {peer_host}:{peer_port}")
            # Share our known peers back
            self.send_to(peer_host, peer_port, {
                "type": "PEERS",
                "peers": list(self.peers)
            })

        elif msg_type == "PEERS":
            # Add newly discovered peers, skipping our own address
            for peer in message.get("peers", []):
                peer = tuple(peer)
                if peer != (self.host, self.port):
                    self.peers.add(peer)

    def connect_to(self, host: str, port: int):
        self.peers.add((host, port))
        self.send_to(host, port, {
            "type": "HELLO",
            "host": self.host,
            "port": self.port
        })

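    def send_to(self, host: str, port: int, message: dict):
        try:
            with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
                s.connect((host, port))
                # sendall() keeps writing until every byte is sent; send() may stop early
                s.sendall(json.dumps(message).encode())
        except OSError:
            # Covers refused connections, timeouts, and unreachable hosts
            self.peers.discard((host, port))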

    def broadcast(self, message: dict):
        for peer in list(self.peers):
            self.send_to(peer[0], peer[1], message)
```

This illustrates the core mechanics: each node listens for incoming connections, can connect to known peers, and propagates peer lists so nodes discover the network structure.

***

## When P2P Makes Sense

* **File distribution at scale** — distributing large files (software releases, datasets) where a central CDN would be expensive or a single point of failure
* **Decentralised applications** — systems that should work without any single operator: dApps on Ethereum, IPFS-hosted content
* **Real-time browser communication** — WebRTC for video/audio calls and data channels directly between browsers without a media relay server
* **Censorship-resistant networks** — networks designed to operate even when some nodes or infrastructure are blocked
* **IoT mesh networks** — sensor networks in environments without reliable central connectivity

***

## Challenges

* **Security**: without a trusted authority, how do you prevent malicious nodes? Cryptographic identity (public/private keys) and consensus rules are the usual answer, but they add complexity.
* **NAT traversal**: most peers are behind NAT routers. Getting two peers behind different NATs to connect directly usually requires a STUN server so each peer can learn its public address, plus a TURN relay as a fallback when no direct path works. There is some irony here: you need a central server to help peers connect directly.
* **Churn**: nodes join and leave constantly. Maintaining a consistent network topology and data availability under churn is a hard problem.
* **Data consistency**: without a single source of truth, achieving consistency requires explicit consensus mechanisms.
* **Debugging**: distributed, decentralised systems are extremely difficult to observe and debug.

***

## Lessons Learned

* **P2P is the right tool for censorship-resistance and decentralisation, not for simplicity.** If you do not need to remove the central authority, do not introduce P2P complexity.
* **BitTorrent's piece selection algorithm (rarest-first) is elegant.** Downloading the rarest pieces first ensures they stay available in the network. This kind of emergent behaviour design is a fascinating area of distributed systems.
* **WebRTC is practical P2P.** If you need real-time browser-to-browser communication, WebRTC is well-supported and handles most NAT traversal for you through ICE, although you still have to provide a signalling channel and STUN/TURN servers.
* **Blockchain is one application of P2P, not a synonym for it.** Most of the interesting P2P engineering is in file systems, networking, and messaging, not blockchains.
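
The rarest-first idea mentioned in the lessons above can be sketched in a few lines. This is an illustrative simplification with made-up names (`rarest_first`, `peer_pieces`). Real clients combine it with other strategies, such as randomising among equally rare pieces.

```python
from collections import Counter


def rarest_first(needed: set[int], peer_pieces: dict[str, set[int]]) -> list[int]:
    """Order the pieces we still need by how few peers hold them (rarest first)."""
    availability = Counter(
        piece for pieces in peer_pieces.values() for piece in pieces
    )
    # Pieces that no connected peer holds cannot be requested yet.
    obtainable = [p for p in needed if availability[p] > 0]
    # Tie-break on piece index so the order is deterministic.
    return sorted(obtainable, key=lambda p: (availability[p], p))


peer_pieces = {
    "peer-a": {0, 1, 2},
    "peer-b": {1, 2, 3},
    "peer-c": {2},
}
order = rarest_first(needed={0, 1, 2, 3}, peer_pieces=peer_pieces)
# Pieces 0 and 3 are each held by one peer, so they come before 1 and 2.
```

Each peer makes this choice using only local knowledge of its neighbours, yet the combined effect is that scarce pieces get replicated before the peers holding them leave.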
