Concurrency - Goroutines and Channels

The Microservice That Processed 10,000 Records in 12 Minutes

It was Thursday afternoon when my teammate pinged me: "The data sync job is taking forever. Can you take a look?"

I pulled up the logs. Our microservice was processing 10,000 user records from an external API, updating our database for each one. Elapsed time: 12 minutes, 34 seconds.

The Python code looked innocent enough:

def sync_users():
    user_ids = get_all_user_ids()  # 10,000 IDs
    
    for user_id in user_ids:
        user_data = fetch_user_from_api(user_id)  # ~70ms per call
        update_database(user_data)  # ~5ms per call
    
    print(f"Synced {len(user_ids)} users")

The math: 10,000 users × 75ms per user = 750,000ms = 12.5 minutes.

We'd tried multiprocessing, but the overhead of spawning processes was eating into our gains. Threading helped a bit, but the GIL limited true parallelism.

That weekend, I rewrote it in Go:

func syncUsers() error {
    userIDs, err := getAllUserIDs()
    if err != nil {
        return err
    }
    
    var wg sync.WaitGroup
    results := make(chan error, len(userIDs))
    
    // Limit to 50 users in flight at once
    sem := make(chan struct{}, 50)
    
    for _, id := range userIDs {
        wg.Add(1)
        go func(userID string) {
            defer wg.Done()
            sem <- struct{}{}        // Acquire semaphore
            defer func() { <-sem }() // Release semaphore
            
            userData, err := fetchUserFromAPI(userID)
            if err != nil {
                results <- err
                return
            }
            
            if err := updateDatabase(userData); err != nil {
                results <- err
            }
        }(id)
    }
    
    wg.Wait()
    close(results)
    
    // Surface the first error instead of silently discarding failures
    for err := range results {
        if err != nil {
            return err
        }
    }
    return nil
}

The result: 10,000 users synced in 15 seconds. A 50x improvement.

I'd used goroutines for the first time, and they changed how I thought about concurrency. This article covers everything I learned.


What Are Goroutines?

A goroutine is a lightweight thread managed by the Go runtime. Unlike OS threads:

  • Lightweight: Start with 2KB of stack (vs 1-2MB for OS threads)

  • Cheap: Can run millions of goroutines on a modest machine

  • Multiplexed: Go runtime schedules goroutines across OS threads

Starting a Goroutine

Critical: The main function is itself a goroutine. If it exits, all other goroutines are terminated.

Basic Example

Output (order may vary):


Channels: Communicating Between Goroutines

Channels are Go's way of letting goroutines communicate safely. They're typed and synchronized.

Creating and Using Channels

Simple Example

Real Example: Concurrent URL Checker


Buffered vs Unbuffered Channels

Unbuffered Channels

Default behavior - sends block until someone receives:

Buffered Channels

Can hold a limited number of values without blocking:

When to use buffered channels:

  • Producer/consumer with different speeds

  • Batch processing

  • Preventing goroutine blocking


Channel Directions

You can restrict channels to send-only or receive-only:

This enforces correct usage at compile time.


Closing Channels

Checking if Channel is Closed

Ranging Over Channels

Critical: If you don't close the channel, the range loop will deadlock waiting for more values.


The Select Statement

select lets you wait on multiple channel operations:

Select with Default

Non-blocking select:

Select with Timeout

Real Example: Worker with Timeout


Real Example: Concurrent Data Processor

This is similar to what I built for the user sync service:


Common Pitfalls and How to Avoid Them

1. Goroutine Leaks

Problem: Goroutines waiting on channels that never get data:

Solution: Always ensure goroutines have an exit path:

2. Channel Deadlocks

Problem: Every goroutine is blocked waiting on another, so none can proceed. The classic case is sending on an unbuffered channel with no receiver:

Solution: Use a buffered channel, or run the receive in a concurrent goroutine:

3. Closing Channels Multiple Times

Problem: Closing an already-closed channel panics:

Solution: Only the sender should close:

4. Sending to Closed Channel

Problem: Sending to a closed channel panics:

Solution: Don't send after closing, or use recover:

5. Data Races

Problem: Accessing shared data without synchronization:

Solution: Use channels or sync primitives:


Your Challenge

Build a concurrent web scraper:


Key Takeaways

  1. Goroutines are cheap: Can run thousands concurrently

  2. Channels synchronize: Safe communication between goroutines

  3. Buffered vs unbuffered: Buffered channels absorb bursts; sends block only once the buffer is full

  4. Channel directions: Enforce send-only or receive-only at compile time

  5. Close channels: Sender closes to signal completion

  6. Select statement: Wait on multiple channels

  7. Avoid leaks: Always provide exit paths for goroutines

  8. Detect races: Use go run -race to find data races


What I Learned

That user sync rewrite taught me that concurrency doesn't have to be hard:

  • Goroutines made parallelism trivial - no thread pools, no complexity

  • Channels eliminated shared state bugs - data races disappeared

  • 15-second sync time saved hours of processing weekly

  • Go's runtime handled the hard parts - scheduling, multiplexing

Coming from Python's threading/multiprocessing pain, Go's concurrency felt magical. But it's not magic - it's thoughtful design. The sync service has been running for 2 years now, processing millions of records without a single concurrency bug.

The 50x speedup was nice. The zero-bug record was better.


Next: Concurrency Patterns

In the next article, we'll explore advanced concurrency patterns: worker pools, fan-out/fan-in, pipelines, and the context package. You'll learn the patterns that turned Go into the language of choice for high-performance systems.
