Peer-to-Peer Architecture

What Sparked My Interest in P2P

I got interested in peer-to-peer architecture in two ways. The first was practical: using BitTorrent to download large Linux ISOs back when home broadband was slow and a direct download from a single server took hours. The second was understanding how Bitcoin worked: a distributed ledger with no central operator, maintained by consensus among thousands of nodes around the world.

Neither of those is something I built myself for a production system. But understanding P2P architecture changed how I thought about centralisation as a default assumption. It made me ask: what does this system gain from having a central authority, and what does it lose?

This article is more conceptual than the others in this series. P2P is not something I reach for regularly in application development, but it is foundational for understanding blockchains, distributed file systems, and decentralised protocols that are increasingly relevant.

What Is Peer-to-Peer Architecture?

In a peer-to-peer network, every participant (peer) acts as both client and server. There is no central authority coordinating communication. Peers discover each other, communicate directly, and share resources (files, compute, bandwidth) without any single node being essential to the network's operation.

In the client-server model, the server is a potential single point of failure and a bottleneck unless it is replicated. In a P2P network, removing any single peer does not bring down the network.

P2P vs Client-Server

| Aspect | Client-Server | Peer-to-Peer |
| --- | --- | --- |
| Control | Centralised | Decentralised |
| Single point of failure | Yes (server) | No |
| Scalability | Server must scale | Network scales with peers |
| Latency | Depends on server | Can route to nearest peer |
| Trust model | Trust the server | Trust the protocol |
| Complexity | Simpler to build | More complex |
| Data consistency | Easy (one authoritative source) | Hard (distributed consensus) |

Types of P2P Networks

1. Pure P2P

Every node is equal — there is no designated "server" node. Discovery, routing, and data transfer happen purely peer-to-peer. The Gnutella protocol used this model.

Problem: Difficult to search and difficult to bootstrap (how do you find anyone when you first join?).

2. Hybrid P2P

A central coordinator helps with peer discovery, but actual data transfer happens directly peer-to-peer. BitTorrent uses this model — a tracker server tells you which peers have the file, but you download from peers directly, not from the tracker.

3. Structured P2P (DHT)

Distributed Hash Tables (DHTs) create a structured overlay network where each node is responsible for a subset of the keyspace. Chord, Kademlia, and Pastry are well-known DHT algorithms. The BitTorrent DHT uses Kademlia.

In a DHT:

Every node has an ID in a key space (e.g., a 160-bit integer)
Each node is responsible for keys "close" to its ID (Kademlia measures closeness with XOR distance; Chord uses clockwise distance around a ring)
To find a value, you route through nodes with IDs progressively closer to the target key
Lookup takes O(log N) hops, where N is the number of nodes
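The XOR-distance idea above can be sketched in a few lines. This is only an illustration of the Kademlia-style metric, not a working DHT: the helper names (`make_id`, `closest_nodes`, `bucket_index`) and the choice to derive IDs by hashing a string with SHA-1 are assumptions made for the example.

```python
import hashlib

ID_BITS = 160  # size of the key space used in the example above


def make_id(name: str) -> int:
    """Derive a 160-bit integer ID from a string (hypothetical ID scheme)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia-style distance: the XOR of two IDs, read as an integer."""
    return a ^ b


def closest_nodes(node_ids: list[int], key: int, k: int = 3) -> list[int]:
    """Return the k node IDs closest to key under XOR distance."""
    return sorted(node_ids, key=lambda n: xor_distance(n, key))[:k]


def bucket_index(own_id: int, other_id: int) -> int:
    """Position of the highest bit in which the two IDs differ.

    Returns -1 when the IDs are identical. A routing table can keep one
    bucket of contacts per index, so far-away regions of the key space
    need only a few known contacts each.
    """
    return xor_distance(own_id, other_id).bit_length() - 1
```

Roughly speaking, each hop of a lookup asks the closest known nodes for contacts that are closer still. If every hop fixes at least one more leading bit of the distance, the number of hops grows with the logarithm of the network size, which is where the O(log N) figure comes from.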

Key Concepts

Peer Discovery

How does a new node find other nodes? Common mechanisms:

Bootstrap nodes — hardcoded well-known nodes that a new peer contacts first
Local network broadcast — announce presence over mDNS or UDP broadcast
Distributed Hash Table — ask known nodes for peers closer to your ID

Data Distribution

In a file-sharing P2P network, files are split into chunks. Each peer can download chunks from multiple sources simultaneously and share chunks it already has, even before it has the complete file. This is why BitTorrent downloads tend to get faster as more peers hold the pieces you need: more sources mean more parallel transfers.
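As a rough sketch of the chunking idea, the snippet below splits a byte string into fixed-size pieces, records a hash per piece, and reassembles pieces that arrive in any order. The function names and the tiny piece size are assumptions for illustration. SHA-1 is used because BitTorrent v1 hashes pieces that way, but nothing here follows the actual wire protocol.

```python
import hashlib


def split_into_pieces(data: bytes, piece_size: int) -> list[bytes]:
    """Cut data into consecutive pieces; the last one may be shorter."""
    return [data[i:i + piece_size] for i in range(0, len(data), piece_size)]


def piece_hashes(pieces: list[bytes]) -> list[str]:
    """One SHA-1 digest per piece, published alongside the file metadata."""
    return [hashlib.sha1(p).hexdigest() for p in pieces]


def verify_piece(index: int, piece: bytes, expected: list[str]) -> bool:
    """Check a piece received from an untrusted peer against its known hash."""
    return hashlib.sha1(piece).hexdigest() == expected[index]


def reassemble(received: dict[int, bytes], expected: list[str]) -> bytes:
    """Join pieces in index order; raise if any piece is missing or corrupt."""
    out = []
    for index in range(len(expected)):
        piece = received.get(index)
        if piece is None or not verify_piece(index, piece, expected):
            raise ValueError(f"piece {index} is missing or corrupt")
        out.append(piece)
    return b"".join(out)
```

Because every piece is verified independently, a peer can safely fetch piece 2 from one source and piece 0 from another, and start sharing each piece as soon as it passes the check.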

Consensus

In blockchain-style P2P networks, nodes must agree on a shared state (the ledger) without a central authority. Consensus algorithms like Proof of Work, Proof of Stake, and PBFT govern how agreement is reached and how conflicting versions are resolved.
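To make Proof of Work a little more concrete, here is a toy version: find a nonce so that the block's SHA-256 hash starts with a given number of zero hex digits. The block layout (`prev_hash|payload|nonce`) and the function names are invented for this sketch. Real networks such as Bitcoin use a binary block header and a numeric target rather than a string prefix.

```python
import hashlib


def block_hash(prev_hash: str, payload: str, nonce: int) -> str:
    """Hash a toy block made of the previous hash, a payload, and a nonce."""
    return hashlib.sha256(f"{prev_hash}|{payload}|{nonce}".encode()).hexdigest()


def is_valid(prev_hash: str, payload: str, nonce: int, difficulty: int) -> bool:
    """Cheap check any node can run: does the hash have enough leading zeros?"""
    return block_hash(prev_hash, payload, nonce).startswith("0" * difficulty)


def mine(prev_hash: str, payload: str, difficulty: int) -> tuple[int, str]:
    """Expensive search: try nonces from 0 upward until one is valid."""
    nonce = 0
    while not is_valid(prev_hash, payload, nonce, difficulty):
        nonce += 1
    return nonce, block_hash(prev_hash, payload, nonce)
```

The asymmetry is the point: `mine` has to try many nonces, while `is_valid` needs a single hash. Because each block commits to the previous block's hash, rewriting history means redoing that work, which is why nodes can resolve conflicts by preferring the chain with the most accumulated work.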

Real-World P2P Systems I Have Studied

| System | Type | What I Learned From It |
| --- | --- | --- |
| BitTorrent | Hybrid P2P, file sharing | Chunking + parallel downloads, tracker vs DHT |
| Bitcoin / Ethereum | Pure P2P, blockchain | Distributed consensus, Merkle trees, UTXO model |
| IPFS | Structured P2P (Kademlia DHT) | Content-addressed storage, decentralised hosting |
| WebRTC | Browser P2P | How browsers establish direct connections via ICE/STUN/TURN |

WebRTC is the one I have actually used in an application — building a simple real-time video call feature without a media server by establishing direct peer connections between browsers.

A Minimal P2P Node in Python

Here is a simplified TCP-based P2P node to illustrate the concept — not production-ready, but useful for understanding the model:

# peer_node.py — minimal P2P node
import socket
import threading
import json
# The built-in set[...] annotation used below needs Python 3.9+; no typing import is required

class PeerNode:
    def __init__(self, host: str, port: int):
        self.host = host
        self.port = port
        self.peers: set[tuple[str, int]] = set()
        self._server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self._server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)

    def start(self):
        self._server.bind((self.host, self.port))
        self._server.listen(10)
        print(f"Node listening on {self.host}:{self.port}")
        threading.Thread(target=self._accept_connections, daemon=True).start()

    def _accept_connections(self):
        while True:
            conn, addr = self._server.accept()
            threading.Thread(
                target=self._handle_connection,
                args=(conn, addr),
                daemon=True
            ).start()

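    def _handle_connection(self, conn: socket.socket, addr: tuple):
        try:
            # The sender closes the socket after one message, so read until EOF
            chunks = []
            while chunk := conn.recv(4096):
                chunks.append(chunk)
            message = json.loads(b"".join(chunks).decode())
            self._handle_message(message, addr)
        except (json.JSONDecodeError, UnicodeDecodeError):
            print(f"Ignoring malformed message from {addr}")
        finally:
            conn.close()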

    def _handle_message(self, message: dict, sender_addr: tuple):
        msg_type = message.get("type")

        if msg_type == "HELLO":
            # New peer announcing itself — register it
            peer_host = message["host"]
            peer_port = message["port"]
            self.peers.add((peer_host, peer_port))
            print(f"Registered peer {peer_host}:{peer_port}")
            # Share our known peers back
            self.send_to(peer_host, peer_port, {
                "type": "PEERS",
                "peers": list(self.peers)
            })

        elif msg_type == "PEERS":
            # Add newly discovered peers, skipping our own address
            for peer in message.get("peers", []):
                peer = tuple(peer)
                if peer != (self.host, self.port):
                    self.peers.add(peer)

    def connect_to(self, host: str, port: int):
        self.peers.add((host, port))
        self.send_to(host, port, {
            "type": "HELLO",
            "host": self.host,
            "port": self.port
        })

    def send_to(self, host: str, port: int, message: dict):
        try:
            with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
                s.connect((host, port))
                # sendall keeps writing until the whole payload has been sent
                s.sendall(json.dumps(message).encode())
        except OSError:
            # Covers refused connections as well as unreachable hosts
            self.peers.discard((host, port))

    def broadcast(self, message: dict):
        for peer in list(self.peers):
            self.send_to(peer[0], peer[1], message)

This illustrates the core mechanics: each node listens for incoming connections, can connect to known peers, and propagates peer lists so nodes discover the network structure.

When P2P Makes Sense

File distribution at scale — distributing large files (software releases, datasets) where a central CDN would be expensive or a single point of failure
Decentralised applications — systems that should work without any single operator: dApps on Ethereum, IPFS-hosted content
Real-time browser communication — WebRTC for video/audio calls and data channels directly between browsers without a media relay server
Censorship-resistant networks — networks designed to operate even when some nodes or infrastructure are blocked
IoT mesh networks — sensor networks in environments without reliable central connectivity

Challenges

Security: without a trusted authority, how do you prevent malicious nodes? Cryptographic identity (public/private keys) and consensus rules are the usual answer, but they add complexity.
NAT traversal: most peers are behind NAT routers. Getting two peers behind different NATs to connect directly usually requires a STUN server to discover their public addresses, and a TURN relay as a fallback when no direct path works. There is a certain irony here: you need a central server to help peers connect directly.
Churn: nodes join and leave constantly. Maintaining a consistent network topology and data availability under churn is a hard problem.
Data consistency: without a single source of truth, achieving consistency requires explicit consensus mechanisms.
Debugging: distributed, decentralised systems are extremely difficult to observe and debug.

Lessons Learned

P2P is the right tool for censorship-resistance and decentralisation, not for simplicity. If you do not need to remove the central authority, do not introduce P2P complexity.
BitTorrent's piece selection algorithm (rarest-first) is elegant. Downloading the rarest pieces first ensures they stay available in the network. This kind of emergent behaviour design is a fascinating area of distributed systems.
WebRTC is practical P2P. If you need real-time browser-to-browser communication, WebRTC is well-supported and handles NAT traversal through ICE, though you still have to provide a signalling channel and STUN/TURN servers.
Blockchain is one application of P2P, not a synonym for it. Most of the interesting P2P engineering is in file systems, networking, and messaging, not blockchains.
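The rarest-first idea mentioned in the lessons above is small enough to sketch. The function below orders the pieces a peer still needs by how few neighbours hold them. The name `rarest_first`, the data shapes, and the deterministic tie-break by piece index are assumptions for this example; real clients typically randomise among equally rare pieces and add special cases for the start and end of a download.

```python
from collections import Counter


def rarest_first(needed: set[int], peer_pieces: dict[str, set[int]]) -> list[int]:
    """Order the pieces we still need so the least-replicated come first.

    needed:      indices of pieces we do not have yet
    peer_pieces: for each connected peer, the set of piece indices it holds
    """
    availability: Counter[int] = Counter()
    for pieces in peer_pieces.values():
        # Count each needed piece once per peer that can serve it
        availability.update(pieces & needed)

    # Pieces nobody holds cannot be requested, so they are left out
    candidates = [p for p in needed if availability[p] > 0]
    return sorted(candidates, key=lambda p: (availability[p], p))
```

With peers `a`, `b`, and `c` holding `{0, 1, 2}`, `{0, 1}`, and `{0, 3}`, a node needing pieces 0 to 3 would request 2 and 3 first, since each exists on only one peer and would vanish if that peer left.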
