Building a high-performance Elasticsearch cluster requires careful planning and optimization. In this article, we’ll explore essential strategies for sharding, replication, index management, and cluster optimization to ensure your Elasticsearch deployment is both fast and scalable.

Introduction

Performance optimization in Elasticsearch involves multiple aspects, from hardware configuration to index design and query optimization. Understanding these components and their interactions is crucial for building efficient search solutions.

Cluster Architecture

Node Types and Roles

# elasticsearch.yml
# node.roles is a static, per-node setting. It is read at startup and
# cannot be changed through the cluster settings API.
node.roles: [ data, ingest, master ]

  1. Master Nodes

    • Dedicated nodes for cluster management
    • Minimum 3 master-eligible nodes, so a quorum survives the loss of one
    • Moderate CPU and memory
  2. Data Nodes

    • High memory and storage capacity
    • Multiple nodes for horizontal scaling
    • Fast storage (SSD recommended)
  3. Ingest Nodes

    • High CPU for data processing
    • Moderate memory requirements
    • Optional for small clusters
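
To check which roles each node actually holds, the cat nodes API can list them. This is a minimal sketch; the column selection is only one reasonable choice.

```console
GET _cat/nodes?v&h=name,node.role,master,heap.percent,cpu,disk.used_percent
```

The node.role column prints one letter per role (for example d for data, i for ingest, m for master-eligible), and the master column marks the currently elected master with an asterisk.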

Sharding Strategy

Shard Size Optimization

PUT /logs-2024.04.06
{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 1,
    "index.routing.allocation.total_shards_per_node": 3
  }
}

Best Practices for Sharding

  1. Shard Size

    • Target 20-50GB per shard
    • Monitor shard size growth
    • Adjust based on data volume
  2. Shard Count

    • Consider node capacity
    • Plan for future growth
    • Balance query performance
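
As a sketch of how to monitor shard size, the cat APIs can list every shard sorted by on-disk size and show how shards and disk usage are spread across nodes. The logs-* pattern is taken from the example above; adjust it to your own index names.

```console
GET _cat/shards/logs-*?v&h=index,shard,prirep,state,docs,store,node&s=store:desc

GET _cat/allocation?v
```

It also helps to check per-node limits against the shard count. With 5 primaries and 1 replica, the index above has 10 shard copies. At total_shards_per_node set to 3, at least 4 data nodes are needed, otherwise some copies stay unassigned.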

Replication Strategy

Replica Configuration

PUT /products
{
  "settings": {
    "number_of_replicas": 2,
    "index.routing.allocation.include._tier_preference": "data_hot,data_warm,data_cold"
  }
}

Replication Factors

  • Production: 1-2 replicas
  • High availability: 2-3 replicas
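  • Disaster recovery: 3+ replicas, combined with snapshots or cross-cluster replication, since replicas inside one cluster do not survive the loss of that cluster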
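
The replica count is a dynamic setting, so it can be changed on a live index. A minimal sketch using the products index from the example above:

```console
PUT /products/_settings
{
  "index": {
    "number_of_replicas": 1
  }
}

GET _cluster/health/products?level=shards
```

A replica is never allocated on the same node as its primary, so an index with N replicas needs at least N + 1 data nodes before its health turns green.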

Index Lifecycle Management (ILM)

ILM Policy Example

PUT _ilm/policy/logs_policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "50gb",
            "max_age": "7d"
          }
        }
      },
      "warm": {
        "min_age": "7d",
        "actions": {
          "forcemerge": {
            "max_num_segments": 1
          },
          "shrink": {
            "number_of_shards": 1
          }
        }
      },
      "cold": {
        "min_age": "30d",
        "actions": {
          "readonly": {}
        }
      },
      "delete": {
        "min_age": "90d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
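
A policy does nothing until an index references it, and the rollover action needs a write alias (or a data stream) to roll over. The sketch below attaches logs_policy through an index template and bootstraps the first index. The names app-logs, app_logs_template, and app-logs-000001 are hypothetical and only serve as an illustration.

```console
PUT _index_template/app_logs_template
{
  "index_patterns": ["app-logs-*"],
  "template": {
    "settings": {
      "index.lifecycle.name": "logs_policy",
      "index.lifecycle.rollover_alias": "app-logs"
    }
  }
}

PUT /app-logs-000001
{
  "aliases": {
    "app-logs": { "is_write_index": true }
  }
}

GET app-logs-*/_ilm/explain
```

The last request reports the current phase, action, and step for each managed index, which is usually the quickest way to see why a policy is not progressing.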

Index Settings Optimization

Performance Settings

PUT /high_performance_index
{
  "settings": {
    "index.refresh_interval": "30s",
    "index.translog.durability": "async",
    "index.translog.sync_interval": "5s",
    "index.merge.scheduler.max_thread_count": 1,
    "index.number_of_routing_shards": 30
  }
}
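
These settings favor indexing throughput over freshness and durability. With async translog durability, writes acknowledged since the last sync can be lost if a node crashes, and a max_thread_count of 1 is the value usually suggested for spinning disks rather than SSDs. One common pattern, sketched here as an assumption about the workflow, is to apply them only during heavy bulk loads and restore the defaults afterwards:

```console
PUT /high_performance_index/_settings
{
  "index.refresh_interval": "1s",
  "index.translog.durability": "request"
}
```

Both settings are dynamic, so no reindex or restart is involved.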

Memory Management

PUT _cluster/settings
{
  "persistent": {
    "indices.breaker.total.limit": "70%",
    "indices.breaker.fielddata.limit": "40%",
    "indices.breaker.request.limit": "60%"
  }
}
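
To see how close each node runs to these limits, the node stats API exposes the breakers. A minimal sketch:

```console
GET _nodes/stats/breaker

PUT _cluster/settings
{
  "persistent": {
    "indices.breaker.fielddata.limit": null
  }
}
```

The first request returns limit_size, estimated_size, and a tripped counter for each breaker. The second shows how to reset one limit to its default by assigning null, should an override turn out to be unnecessary.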

Query Optimization

Search Settings

PUT /search_optimized
{
  "settings": {
    "index.search.slowlog.threshold.query.warn": "10s",
    "index.search.slowlog.threshold.query.info": "5s",
    "index.search.slowlog.threshold.query.debug": "2s",
    "index.search.slowlog.threshold.query.trace": "500ms"
  }
}
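
Slow log thresholds are dynamic and are measured per shard, not per whole request. As a sketch, fetch-phase thresholds can be added to the existing index without recreating it; the values below are illustrative, not recommendations.

```console
PUT /search_optimized/_settings
{
  "index.search.slowlog.threshold.fetch.warn": "1s",
  "index.search.slowlog.threshold.fetch.info": "800ms"
}
```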

Query Performance Tips

  1. Filter Context

    GET /products/_search
    {
      "query": {
        "bool": {
          "filter": [
            { "term": { "status": "active" } },
            { "range": { "price": { "gte": 100 } } }
          ]
        }
      }
    }
    
  2. Pagination Optimization

    GET /products/_search
    {
      "from": 0,
      "size": 10,
      "track_total_hits": false
    }
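
Paging with from and size is limited by index.max_result_window (10,000 by default) and gets more expensive the deeper it goes. For deep pagination, a sketch using a point in time with search_after could look like the following. The <pit-id> value and the search_after values are placeholders: the first comes from the response of the _pit request, the second is copied from the sort array of the last hit on the previous page (the first page omits search_after).

```console
POST /products/_pit?keep_alive=1m

GET /_search
{
  "size": 10,
  "track_total_hits": false,
  "pit": { "id": "<pit-id>", "keep_alive": "1m" },
  "sort": [
    { "price": "asc" },
    { "_shard_doc": "asc" }
  ],
  "search_after": [100, 42]
}

DELETE /_pit
{
  "id": "<pit-id>"
}
```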
    

Monitoring and Maintenance

Cluster Health Monitoring

GET _cluster/health?pretty
GET _nodes/stats?pretty
GET _cat/indices?v
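
When health is yellow or red, or searches start to queue, two further requests tend to narrow things down. This is a sketch; the thread pool names and columns are one possible selection.

```console
GET _cluster/allocation/explain

GET _cat/thread_pool/search,write?v&h=node_name,name,active,queue,rejected
```

Without a request body, the allocation explain API picks an unassigned shard and reports why it cannot be allocated; it returns an error if no shard is unassigned. A growing rejected count in the thread pool output suggests the node cannot keep up with the request rate.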

Regular Maintenance Tasks

  1. Force Merge

    POST /logs-*/_forcemerge?max_num_segments=1
    
  2. Cache Clear

    POST /_cache/clear
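
Both tasks are usually better scoped than the wildcard forms above. Force merging down to one segment is intended for indices that no longer receive writes, and clearing every cache leaves the next queries cold. A more targeted sketch, using the logs-2024.04.06 index from earlier:

```console
GET _cat/segments/logs-2024.04.06?v&h=index,shard,prirep,segment,docs.count,size

POST /logs-2024.04.06/_forcemerge?max_num_segments=1

POST /logs-2024.04.06/_cache/clear?fielddata=true
```

The first request shows the current segment count, which tells you whether a merge is worth running at all.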
    

Scaling Strategies

Horizontal Scaling

  1. Add Data Nodes

    • Monitor shard distribution
    • Balance cluster load
    • Update replica settings
  2. Split Indices

    POST /large_index/_split/split_index
    {
      "settings": {
        "index.number_of_shards": 10
      }
    }
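
The split request above only succeeds if the source index is blocked for writes, and the target shard count must be a multiple of the source's primary shard count. A sketch of the surrounding steps, reusing the large_index and split_index names from the example:

```console
GET /large_index/_settings/index.number_of_shards

PUT /large_index/_settings
{
  "settings": {
    "index.blocks.write": true
  }
}

GET _cluster/health/split_index?wait_for_status=green&timeout=60s
```

Run the first two requests before the split and the health check after it, so that traffic is only switched once all shards of the new index are allocated.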
    

Vertical Scaling

  1. Memory Optimization

    • JVM heap settings
    • Field data cache
    • Query cache
  2. Storage Optimization

    • Use SSDs
    • RAID configuration
    • File system settings
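
Before changing heap or cache settings, it is worth measuring current usage. A minimal sketch with the node stats and cat APIs:

```console
GET _nodes/stats/jvm

GET _nodes/stats/indices/fielddata,query_cache,request_cache

GET _cat/fielddata?v
```

As a commonly cited rule of thumb, heap is set to no more than half of the machine's RAM and kept below the compressed object pointer threshold (roughly 32GB), leaving the rest to the filesystem cache.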

Best Practices

  1. Index Design

    • Use appropriate mappings
    • Optimize field types
    • Consider data lifecycle
  2. Query Design

    • Use filters effectively
    • Optimize pagination
    • Monitor slow queries
  3. Cluster Management

    • Regular monitoring
    • Capacity planning
    • Backup strategies
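
As an illustration of the index design points, an explicit mapping avoids surprises from dynamic field detection. The status and price fields come from the query example earlier; the products_v2 index name, the description field, and the scaling factor are assumptions made for this sketch.

```console
PUT /products_v2
{
  "mappings": {
    "dynamic": "strict",
    "properties": {
      "status": { "type": "keyword" },
      "price": { "type": "scaled_float", "scaling_factor": 100 },
      "description": { "type": "text" }
    }
  }
}
```

A keyword field suits exact-match term filters, while text is analyzed for full-text search. With dynamic set to strict, documents containing unmapped fields are rejected instead of silently growing the mapping.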

Common Issues and Solutions

Performance Problems

  • Monitor slow logs
  • Check resource usage
  • Optimize queries
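
Beyond the slow logs, a few requests can help locate where time is being spent. This is a sketch; the profiled query reuses the filter example from earlier.

```console
GET _nodes/hot_threads

GET _tasks?actions=*search*&detailed=true

GET /products/_search
{
  "profile": true,
  "query": {
    "bool": {
      "filter": [
        { "term": { "status": "active" } }
      ]
    }
  }
}
```

Hot threads shows what each node's busiest threads are doing, the task list reveals long-running searches, and the profile output breaks a single query down by shard and query component. Profiling adds overhead, so it is best kept to ad hoc debugging.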

Scaling Issues

  • Review shard strategy
  • Adjust node roles
  • Update cluster settings

Next Steps

After optimizing performance:

  1. Implement monitoring
  2. Set up alerts
  3. Plan for growth
  4. Document procedures

Conclusion

Building a high-performance Elasticsearch cluster requires:

  • Proper sharding strategy
  • Effective replication
  • Optimized index settings
  • Regular maintenance

Remember to:

  • Monitor performance metrics
  • Plan for scalability
  • Follow best practices
  • Document configurations

Stay tuned for our next article on Elasticsearch security and access control.