Part 2: Elasticsearch - Search and Analytics Engine

Part of the ELK Stack 101 Series

My First Elasticsearch Query

I'll never forget the moment Elasticsearch clicked for me. I had just migrated a month's worth of application logs - 50 million documents. Using traditional grep on log files, finding a specific error took 10+ minutes.

In Elasticsearch, I typed:

GET /logs-*/_search
{
  "query": {
    "match": { "error.message": "payment timeout" }
  }
}

Response time: 47 milliseconds. Across 50 million documents. That's when I realized the power of Elasticsearch.

In this article, I'll share everything I learned about Elasticsearch - from installation to advanced querying, based on real projects and actual production usage.

What is Elasticsearch Really?

Beyond the marketing buzzwords, here's what I understand Elasticsearch to be:

Elasticsearch is a distributed search and analytics engine built on Apache Lucene. Think of it as:

  • A NoSQL document database (stores JSON)

  • A full-text search engine (like Google for your data)

  • An analytics platform (aggregations, statistics)

  • A distributed system (scales horizontally)

Written in Java, runs on the JVM, and exposes everything via RESTful HTTP APIs.

Installing Elasticsearch

I've installed Elasticsearch many ways. Here are the approaches I use:

Method 1: Docker (My Favorite for Development)

Single node for testing:

Verify it's running:

The response includes the tagline "You Know, for Search". That tagline never gets old.

Method 2: Docker Compose (Multi-Node Development)

docker-compose.yml:

Start cluster:

Check cluster health:

Method 3: Linux Installation (Production)

On Ubuntu/Debian:

Configuration file: /etc/elasticsearch/elasticsearch.yml

Core Elasticsearch Concepts

Let me explain the concepts that took me a while to grasp.

1. Documents and Indices

Document = A single JSON record (like a row in a database)

Index = Collection of similar documents (like a table)

An index is where documents live. I create time-based indices:
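As a rough illustration of the document/index relationship, here is a minimal sketch that picks a daily index for a document from its timestamp. The `logs-YYYY.MM.DD` naming scheme, the `index_for` helper, and the sample fields are assumptions made for this example, not something Elasticsearch enforces.

```python
from datetime import datetime


def index_for(doc: dict, prefix: str = "logs") -> str:
    """Pick the daily index name for a document from its @timestamp field."""
    ts = datetime.fromisoformat(doc["@timestamp"])
    return f"{prefix}-{ts:%Y.%m.%d}"


# A document is just a JSON object; these fields are made up for the example.
doc = {
    "@timestamp": "2024-01-15T10:30:00+00:00",
    "level": "ERROR",
    "message": "payment timeout",
}

print(index_for(doc))  # logs-2024.01.15
```

With names like this, a pattern such as `logs-*` covers every daily index at once.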

2. Mappings (Schema)

Mapping defines the structure of documents - field types, analyzers, etc.

My typical log mapping:

Field types I use:

  • keyword: Exact match, aggregations, sorting (log level, user ID)

  • text: Full-text search, analyzed (error messages)

  • date: Timestamps, date range queries

  • integer/long: Numbers (response time, counts)

  • boolean: True/false flags

  • ip: IP addresses (with range search support)

  • geo_point: Geographic coordinates

Key lesson: Choose field types carefully. By default you can't aggregate or sort on text fields, and keyword fields only match exact values, so full-text search won't work on them.
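A minimal sketch of a log mapping built from the field types above, written as the Python dict you would send as the body of an index-creation request. Every field name here (`level`, `service`, `response_time_ms`, `client_ip`) is an assumption for illustration. The `message` field uses a keyword sub-field so the same value can be searched as text and aggregated as a keyword.

```python
import json

# Hypothetical log mapping; adjust field names to your own documents.
log_mapping = {
    "mappings": {
        "properties": {
            "@timestamp": {"type": "date"},
            "level": {"type": "keyword"},
            "service": {"type": "keyword"},
            "message": {
                "type": "text",
                # Multi-field: message.raw holds the exact, unanalyzed value.
                "fields": {"raw": {"type": "keyword", "ignore_above": 256}},
            },
            "response_time_ms": {"type": "integer"},
            "client_ip": {"type": "ip"},
        }
    }
}

print(json.dumps(log_mapping, indent=2))
```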

3. Shards and Replicas

Shard = Subset of an index, allows horizontal scaling

An index is divided into shards. Each shard is a self-contained Lucene index.

Example: Index with 3 primary shards

Replica = Backup copy of a shard

My shard strategy:

  • Small indices (< 10GB): 1 primary shard

  • Medium indices (10-100GB): 3-5 primary shards

  • Large indices (> 100GB): 5-10 primary shards

Replicas: Always 1+ replica in production (for redundancy)

Setting shards and replicas:

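Important: You can't change the primary shard count of an existing index in place. Changing it later means reindexing into a new index or using the shrink/split APIs. Choose wisely.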
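Here is a sketch of the settings body, plus a simplified model of why the primary shard count is fixed. Documents are routed to a shard by hashing a routing value (the document ID by default) modulo the number of primary shards. The `shard_for` helper is hypothetical and uses CRC32 as a stand-in; Elasticsearch itself uses a Murmur3 hash, and the real formula has extra terms.

```python
import zlib

# Body for creating an index with 3 primary shards and 1 replica of each.
index_settings = {"settings": {"number_of_shards": 3, "number_of_replicas": 1}}


def shard_for(doc_id: str, num_primary_shards: int) -> int:
    """Simplified routing: hash the ID, take it modulo the shard count."""
    # Stand-in hash: Elasticsearch uses Murmur3 on the routing value, not CRC32.
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards


doc_ids = [f"doc-{i}" for i in range(1000)]
moved = sum(1 for d in doc_ids if shard_for(d, 3) != shard_for(d, 4))
print(f"{moved} of {len(doc_ids)} documents would map to a different shard")
```

Going from 3 to 4 shards changes the result of the modulo for most IDs, so existing documents would no longer be found where routing expects them. That is the reason the count is locked at creation time.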

4. Nodes and Clusters

Node = Single Elasticsearch instance (one Java process)

Cluster = Group of nodes working together

Node types I configure:

Master node: Manages cluster state, creates/deletes indices

Data node: Stores data, executes queries

Coordinating node: Routes requests, merges results (no data, not master)

My typical cluster, built around 5 data nodes:

  • 3 master-eligible nodes (HA for cluster state)

  • 5 data nodes (distribute data)

  • 2 coordinating nodes (dedicated query routers)

Indexing Data

Let me show you the different ways I index data into Elasticsearch.

Method 1: Single Document via REST API

Method 2: Bulk API (High Throughput)

For indexing many documents efficiently:

Performance: Bulk indexing can reach 10,000+ documents per second per node, depending on document size, mappings, and hardware.

My bulk indexing script (Python):
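A minimal sketch of the part every bulk script has to get right: the request body. The Bulk API expects newline-delimited JSON, with one action line followed by one source line per document, and a trailing newline at the end. The `build_bulk_body` helper and the sample documents are assumptions for illustration; sending the body (a POST to `/_bulk` with `Content-Type: application/x-ndjson`) is left out so the example runs without a cluster.

```python
import json
from typing import Iterable


def build_bulk_body(index: str, docs: Iterable[dict]) -> str:
    """Build an NDJSON body that indexes each document into `index`."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(doc))                           # source line
    if not lines:
        return ""
    # The Bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Made-up sample documents.
sample_docs = [
    {"level": "ERROR", "message": "payment timeout"},
    {"level": "INFO", "message": "order created"},
]

body = build_bulk_body("logs-2024.01.15", sample_docs)
print(body, end="")
```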

Method 3: Via Logstash or Beats

This is how I actually do it in production - covered in Parts 3 and later.

Searching Data

Here's where Elasticsearch shines. Let me show you query patterns I use daily.

Query Syntax Options

1. URI Search (Quick and Dirty)

2. Query DSL (Powerful, My Preference)

3. Kibana Query Language (KQL) in Kibana UI

Common Query Types

Match Query (Full-Text Search)

Searches analyzed text fields:

Finds documents containing "database", "connection", or "failed" (OR by default).
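A small sketch of a match query body for the text "database connection failed", assuming the text lives in a field called `message` (the field name and the `match_query` helper are assumptions). Setting `operator` to `and` requires all three terms instead of any one of them.

```python
import json


def match_query(field: str, text: str, operator: str = "or") -> dict:
    """Build a match query; operator is 'or' (default) or 'and'."""
    if operator not in ("or", "and"):
        raise ValueError("operator must be 'or' or 'and'")
    return {"query": {"match": {field: {"query": text, "operator": operator}}}}


any_term = match_query("message", "database connection failed")
all_terms = match_query("message", "database connection failed", operator="and")

print(json.dumps(all_terms, indent=2))
```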

Term Query (Exact Match)

For keyword fields:

Range Query

For dates, numbers:

Or

Bool Query (Combine Multiple Conditions)

My most-used query type:

Breakdown:

  • must: Document MUST match (affects scoring)

  • filter: Document MUST match (no scoring, faster, cacheable)

  • must_not: Document MUST NOT match

  • should: Document SHOULD match (increases score if it does)

Use filter for exact matches, must for full-text search.
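To make the four clause types concrete, here is a sketch of a small builder that assembles a bool query and leaves out empty clauses. The `bool_query` helper and the field names (`message`, `level`, `service`) are assumptions for illustration.

```python
import json
from typing import List, Optional


def bool_query(
    must: Optional[List[dict]] = None,
    filters: Optional[List[dict]] = None,
    must_not: Optional[List[dict]] = None,
    should: Optional[List[dict]] = None,
) -> dict:
    """Assemble a bool query, omitting clause types that were not provided."""
    clauses = {"must": must, "filter": filters, "must_not": must_not, "should": should}
    return {"query": {"bool": {name: c for name, c in clauses.items() if c}}}


body = bool_query(
    # Full-text search goes in must, so it contributes to the score.
    must=[{"match": {"message": "payment timeout"}}],
    # Exact matches and ranges go in filter: no scoring, cacheable.
    filters=[
        {"term": {"level": "ERROR"}},
        {"range": {"@timestamp": {"gte": "now-15m"}}},
    ],
    must_not=[{"term": {"service": "healthcheck"}}],
)

print(json.dumps(body, indent=2))
```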

Practical Search Examples

Example 1: Find Errors in Last Hour
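One way to express this, sketched as a Python function that returns the request body. It assumes a keyword field `level` and a date field `@timestamp`; the `errors_since` helper is hypothetical. Both conditions sit in filter context because neither needs scoring.

```python
import json


def errors_since(window: str = "1h", size: int = 50) -> dict:
    """Newest-first ERROR documents from the last `window` (date math, e.g. 1h)."""
    return {
        "size": size,
        "sort": [{"@timestamp": {"order": "desc"}}],
        "query": {
            "bool": {
                "filter": [
                    {"term": {"level": "ERROR"}},
                    {"range": {"@timestamp": {"gte": f"now-{window}"}}},
                ]
            }
        },
    }


print(json.dumps(errors_since(), indent=2))
```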

Example 2: Slow API Requests

Example 3: Search Across Multiple Fields
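A sketch using a `multi_match` query, which runs the same text against several fields. The field names and the `search_fields` helper are assumptions; the `^2` suffix boosts matches in that field.

```python
import json
from typing import List


def search_fields(text: str, fields: List[str]) -> dict:
    """Build a multi_match query over the given fields."""
    return {"query": {"multi_match": {"query": text, "fields": fields}}}


# message^2 means a hit in `message` counts twice as much as one in the others.
body = search_fields("payment timeout", ["message^2", "error.message", "service"])

print(json.dumps(body, indent=2))
```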

Example 4: Wildcard and Regex

Warning: Wildcards and regex can be slow. Use sparingly.

Aggregations (Analytics)

Aggregations are how I generate statistics, metrics, and insights.

Metric Aggregations

Count of Documents

Average, Min, Max, Sum

Percentiles

Bucket Aggregations

Terms Aggregation (Group By)

Count logs by level:

Response:
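A sketch of the request body and of reading the buckets out of a response. The field name `level`, the aggregation name `by_level`, and the `bucket_counts` helper are assumptions, and the sample response is hand-written with made-up counts to show the shape only.

```python
# Request: size 0 skips the hits and returns only the aggregation.
terms_request = {
    "size": 0,
    "aggs": {"by_level": {"terms": {"field": "level", "size": 10}}},
}


def bucket_counts(response: dict, agg_name: str) -> dict:
    """Flatten a terms aggregation into {key: doc_count}."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {b["key"]: b["doc_count"] for b in buckets}


# Hand-written sample with made-up numbers, trimmed to the relevant part.
sample_response = {
    "aggregations": {
        "by_level": {
            "buckets": [
                {"key": "INFO", "doc_count": 1200},
                {"key": "WARN", "doc_count": 85},
                {"key": "ERROR", "doc_count": 12},
            ]
        }
    }
}

print(bucket_counts(sample_response, "by_level"))
# {'INFO': 1200, 'WARN': 85, 'ERROR': 12}
```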

Date Histogram (Time Series)

Logs per hour:

Range Aggregation

Group response times into buckets:

Nested Aggregations

Errors by service, then by hour:

This is how I build dashboards - nested aggregations for multi-dimensional analysis.
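A sketch of an "errors by service, then by hour" request: a terms aggregation with a date histogram nested inside it. The field names (`level`, `service`, `@timestamp`) and aggregation names are assumptions for illustration.

```python
import json

errors_by_service_per_hour = {
    "size": 0,
    # Filter first, so only ERROR documents reach the aggregations.
    "query": {"bool": {"filter": [{"term": {"level": "ERROR"}}]}},
    "aggs": {
        "by_service": {
            "terms": {"field": "service", "size": 10},
            # Sub-aggregation: runs once inside every service bucket.
            "aggs": {
                "per_hour": {
                    "date_histogram": {
                        "field": "@timestamp",
                        "fixed_interval": "1h",
                    }
                }
            },
        }
    },
}

print(json.dumps(errors_by_service_per_hour, indent=2))
```

Each service bucket in the response then carries its own list of hourly buckets, which maps directly onto a stacked chart in a dashboard.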

Index Templates

Index templates automatically apply settings and mappings to new indices.

My logs template:

Now every index matching logs-* gets these settings automatically.
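A sketch of what such a template body could look like, assuming the composable index template format (sent with a PUT to `/_index_template/<name>`). The settings and mappings are placeholders, and `template_applies` is a hypothetical helper that approximates the wildcard matching with Python's `fnmatchcase`.

```python
from fnmatch import fnmatchcase

logs_template = {
    "index_patterns": ["logs-*"],
    "priority": 100,
    "template": {
        "settings": {"number_of_shards": 3, "number_of_replicas": 1},
        "mappings": {
            "properties": {
                "@timestamp": {"type": "date"},
                "level": {"type": "keyword"},
            }
        },
    },
}


def template_applies(template: dict, index_name: str) -> bool:
    """Approximate check: does any pattern in the template match the name?"""
    return any(fnmatchcase(index_name, p) for p in template["index_patterns"])


print(template_applies(logs_template, "logs-2024.01.15"))  # True
print(template_applies(logs_template, "metrics-2024.01.15"))  # False
```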

Index Lifecycle Management (ILM)

ILM automates index lifecycle - from creation to deletion.

My logs ILM policy:

What this does:

  • Hot phase: Keep writing to the index until it is 1 day old or reaches 50GB, then roll over to a new index

  • Delete phase: Delete indices once they are more than 30 days old (ILM counts the age from rollover)

Saves storage, maintains performance.
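The two phases described above can be sketched as the following policy body (sent with a PUT to `/_ilm/policy/<name>`). This is reconstructed from the description, so treat the exact structure as an assumption to check against your Elasticsearch version.

```python
import json

logs_ilm_policy = {
    "policy": {
        "phases": {
            "hot": {
                "actions": {
                    # Roll over when either condition is met first.
                    "rollover": {"max_age": "1d", "max_size": "50gb"}
                }
            },
            "delete": {
                # Age is counted from rollover, not from index creation.
                "min_age": "30d",
                "actions": {"delete": {}},
            },
        }
    }
}

print(json.dumps(logs_ilm_policy, indent=2))
```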

Performance Optimization

Lessons I learned the hard way.

1. Use Filter Context When Possible

Slow (scoring overhead):

Fast (no scoring, cacheable):

2. Limit Result Size

Don't do this:

Do this:

For large result sets, use search_after or the scroll API. Plain from + size paging is capped at 10,000 hits by default (index.max_result_window).
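A sketch of how `search_after` paging fits together: each page reuses the same query and sort, and passes the sort values of the last hit from the previous page. The helpers are hypothetical, and `event_id` is an assumed unique keyword field used as a tiebreaker so that documents with equal timestamps are not skipped.

```python
import copy


def first_page(size: int = 100) -> dict:
    """First request: a fixed sort order, no search_after yet."""
    return {
        "size": size,
        "query": {"match_all": {}},
        # event_id is a hypothetical unique keyword field used as a tiebreaker.
        "sort": [{"@timestamp": "desc"}, {"event_id": "asc"}],
    }


def next_page(previous_body: dict, last_hit: dict) -> dict:
    """Next request: same body plus the sort values of the last hit seen."""
    body = copy.deepcopy(previous_body)
    body["search_after"] = last_hit["sort"]
    return body


# Made-up last hit from a previous page; only its "sort" array matters here.
last_hit = {"_id": "abc", "sort": [1705314600000, "evt-0042"]}
page_two = next_page(first_page(), last_hit)

print(page_two["search_after"])  # [1705314600000, 'evt-0042']
```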

3. Use Index Patterns Wisely

Slow (searches all indices):

Fast (searches specific date range):
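A sketch of building a narrow index list instead of a catch-all wildcard, assuming daily indices named `logs-YYYY.MM.DD` (the naming scheme and the `indices_for_last_days` helper are assumptions). The result goes into the request path, since Elasticsearch accepts a comma-separated list of index names there.

```python
from datetime import date, timedelta


def indices_for_last_days(days: int, today: date, prefix: str = "logs") -> str:
    """Comma-separated daily index names covering the last `days` days."""
    names = []
    for offset in range(days - 1, -1, -1):  # oldest first
        day = today - timedelta(days=offset)
        names.append(f"{prefix}-{day:%Y.%m.%d}")
    return ",".join(names)


target = indices_for_last_days(3, today=date(2024, 1, 15))
print(f"GET /{target}/_search")
# GET /logs-2024.01.13,logs-2024.01.14,logs-2024.01.15/_search
```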

4. Bulk Indexing Best Practices

  • Optimal bulk size: 5-15 MB per request

  • Parallel bulk requests: 2-4 per node

  • Refresh interval: Increase during bulk indexing
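To stay inside a byte budget per request, documents can be grouped by their serialized size rather than by count. A minimal sketch, assuming a hypothetical `chunk_by_bytes` helper; it estimates size from the JSON of each document and ignores the small action line that the Bulk API adds per document.

```python
import json
from typing import Iterable, Iterator, List


def chunk_by_bytes(
    docs: Iterable[dict], max_bytes: int = 10 * 1024 * 1024
) -> Iterator[List[dict]]:
    """Yield batches whose approximate serialized size stays under max_bytes."""
    batch: List[dict] = []
    batch_bytes = 0
    for doc in docs:
        doc_bytes = len(json.dumps(doc).encode("utf-8")) + 1  # +1 for newline
        # Start a new batch if this document would push us over the budget.
        if batch and batch_bytes + doc_bytes > max_bytes:
            yield batch
            batch, batch_bytes = [], 0
        batch.append(doc)
        batch_bytes += doc_bytes
    if batch:
        yield batch


# Tiny budget so the batching is visible: each document is 9 bytes here.
docs = [{"n": i} for i in range(10)]
print([len(b) for b in chunk_by_bytes(docs, max_bytes=30)])  # [3, 3, 3, 1]
```

A single document larger than the budget still gets a batch of its own, so nothing is dropped.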

After bulk indexing, reset to default:
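A sketch of the two settings bodies involved, each sent with a PUT to `/<index>/_settings`. The `30s` value during the load is an assumption (pick what suits your workload); `1s` is the Elasticsearch default. The `refresh_interval_settings` helper is hypothetical.

```python
import json


def refresh_interval_settings(interval: str) -> dict:
    """Body for the update-index-settings API that sets refresh_interval."""
    return {"index": {"refresh_interval": interval}}


during_bulk = refresh_interval_settings("30s")  # fewer refreshes while loading
after_bulk = refresh_interval_settings("1s")    # back to the default

print(json.dumps(during_bulk))
print(json.dumps(after_bulk))
```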

5. Mapping Optimization

Disable _source for metrics (only if you never need the original document; reindexing, updates, and highlighting all depend on it):

Set ignore_above for long strings:
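A sketch of both mapping tweaks as Python dicts. The field names (`cpu_percent`, `user_agent`) and the 256-character limit are assumptions for illustration. The `is_indexed` helper is hypothetical and only models the rule: values longer than `ignore_above` characters are not indexed for that field.

```python
# Metrics index without stored _source (the original JSON cannot be retrieved).
metrics_mapping = {
    "mappings": {
        "_source": {"enabled": False},
        "properties": {
            "@timestamp": {"type": "date"},
            "cpu_percent": {"type": "float"},
        },
    }
}

# Keyword field that skips indexing for very long values.
logs_mapping = {
    "mappings": {
        "properties": {
            "user_agent": {"type": "keyword", "ignore_above": 256},
        }
    }
}


def is_indexed(value: str, ignore_above: int) -> bool:
    """Model of ignore_above: longer strings are left out of the index."""
    return len(value) <= ignore_above


limit = logs_mapping["mappings"]["properties"]["user_agent"]["ignore_above"]
print(is_indexed("Mozilla/5.0", limit))  # True
print(is_indexed("x" * 300, limit))      # False
```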

Useful Elasticsearch APIs

Cluster Health

Node Stats

Index Stats

Cat APIs (Human-Readable)

Index Management

Common Issues and Solutions

Issue 1: Unassigned Shards

Problem: Yellow/red cluster, shards not assigned

Check:

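Solution: Usually there are not enough nodes for the replicas. A replica is never placed on the same node as its primary, so a single-node cluster with 1 replica configured stays yellow until you add a node or set the replica count to 0.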

Issue 2: Slow Queries

Check slow logs:

Enable slow query logging:

Check logs: /var/log/elasticsearch/[cluster-name]_index_search_slowlog.log

Issue 3: Out of Memory

Check heap usage:

Solution: Increase heap (up to 50% of RAM, and keep it below about 32GB so the JVM can keep using compressed object pointers)

Edit /etc/elasticsearch/jvm.options:
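A sketch of the sizing rule as a small calculation that prints the two lines to set in that file. The helpers are hypothetical, and the 31GB cap is an assumption chosen to stay safely under the 32GB threshold. Minimum and maximum heap are set to the same value so the heap is not resized at runtime.

```python
from typing import List


def recommended_heap_gb(total_ram_gb: int, cap_gb: int = 31) -> int:
    """Half of RAM, capped below the ~32GB compressed-pointer threshold."""
    return max(1, min(total_ram_gb // 2, cap_gb))


def jvm_heap_options(total_ram_gb: int) -> List[str]:
    """The -Xms/-Xmx lines for jvm.options, with min and max heap equal."""
    heap = recommended_heap_gb(total_ram_gb)
    return [f"-Xms{heap}g", f"-Xmx{heap}g"]


print(jvm_heap_options(16))   # ['-Xms8g', '-Xmx8g']
print(jvm_heap_options(128))  # ['-Xms31g', '-Xmx31g']
```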

Issue 4: Disk Space

Check disk usage:

Solution: Delete old indices, increase disk, or implement ILM

Conclusion

Elasticsearch is the core of the ELK stack - the engine that makes everything work. Key takeaways:

Core concepts:

  • Documents and indices (data structure)

  • Mappings (schema definition)

  • Shards and replicas (distribution and redundancy)

  • Nodes and clusters (scaling)

Key operations:

  • Indexing (single, bulk)

  • Searching (match, term, bool, range)

  • Aggregations (metrics, buckets, nested)

  • Index templates and ILM

Performance:

  • Use filter context

  • Limit result sizes

  • Optimize mappings

  • Bulk index efficiently

In the next article, we'll explore Logstash - the data processing pipeline that feeds Elasticsearch.
