Learn Elasticsearch: Mapping and Analyzers Explained

Mapping and text analysis are fundamental concepts in Elasticsearch that determine how your data is processed, stored, and searched. In this article, we’ll explore how to define mappings and configure analyzers for optimal search results.

Introduction

Search quality in Elasticsearch depends on how each field is interpreted at index time. Mappings declare the type and indexing options of every field, and analyzers convert text into the tokens that are stored in the inverted index and matched at query time.

Understanding Mappings

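Mappings define the schema for your documents, specifying how each field is stored and indexed. The most important distinction is between text fields, which are analyzed for full-text search, and keyword fields, which are indexed as exact values for filtering, sorting, and aggregations.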

Basic Mapping Types

PUT /products
{
  "mappings": {
    "properties": {
      "id": { "type": "keyword" },
      "name": { "type": "text" },
      "description": { "type": "text" },
      "price": { "type": "float" },
      "created_at": { "type": "date" },
      "tags": { "type": "keyword" },
      "in_stock": { "type": "boolean" }
    }
  }
}
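
As a quick check of the mapping above, the following sketch indexes one document and then reads the mapping back. The document ID and all field values are made up for illustration.

```console
# Sample document; every value here is illustrative
POST /products/_doc/1
{
  "id": "SKU-1001",
  "name": "Wireless Mouse",
  "description": "Ergonomic wireless mouse with USB receiver",
  "price": 24.99,
  "created_at": "2024-01-15T10:30:00Z",
  "tags": ["electronics", "accessories"],
  "in_stock": true
}

# Returns the field types Elasticsearch is actually using
GET /products/_mapping
```

Note that tags holds an array. Elasticsearch has no dedicated array type, so any field can contain multiple values of its mapped type.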

Field Data Types

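Text Fields

A text field is analyzed, so by default it cannot be used for sorting or aggregations. Adding a keyword sub-field under fields indexes the same value a second time as an exact string. The ignore_above parameter skips indexing keyword values longer than 256 characters.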

PUT /articles
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
  }
}
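
One way to use both representations is sketched below, assuming the index already holds a few articles (the query text is made up). Full-text matching runs against title, while sorting uses the exact value in title.keyword.

```console
# Match on the analyzed field, sort on the exact-value sub-field
GET /articles/_search
{
  "query": {
    "match": { "title": "elasticsearch mapping" }
  },
  "sort": [
    { "title.keyword": "asc" }
  ]
}
```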

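Numeric Fields

Pick the smallest numeric type that covers the expected range of values. A scaled_float is stored internally as a long after the value is multiplied by scaling_factor, so with a factor of 100 the value 19.99 is stored as 1999. This suits prices with two decimal places.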

PUT /metrics
{
  "mappings": {
    "properties": {
      "cpu_usage": { "type": "float" },
      "memory_total": { "type": "long" },
      "request_count": { "type": "integer" },
      "price": { "type": "scaled_float", "scaling_factor": 100 }
    }
  }
}
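
The following sketch indexes one sample document into the metrics index and filters on a numeric field. The values are illustrative only, and a newly indexed document may not be searchable until the next refresh.

```console
# Sample values are illustrative only
POST /metrics/_doc
{
  "cpu_usage": 73.5,
  "memory_total": 17179869184,
  "request_count": 1280,
  "price": 19.99
}

# Range queries work on every numeric type
GET /metrics/_search
{
  "query": {
    "range": {
      "cpu_usage": { "gte": 70 }
    }
  }
}
```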

Text Analysis Components

Analyzers

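An analyzer is a pipeline of three components, applied in this order:

- Character filters: transform the raw text before tokenization, for example by stripping HTML
- Tokenizer: splits the text into individual tokens
- Token filters: modify, remove, or add tokens, for example by lowercasing or removing stop words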

PUT /my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_custom_analyzer": {
          "type": "custom",
          "char_filter": ["html_strip"],
          "tokenizer": "standard",
          "filter": ["lowercase", "stop", "snowball"]
        }
      }
    }
  }
}
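
A custom analyzer can be tried out with the _analyze API once the index exists. The sample text below is made up for illustration.

```console
# Run the custom analyzer against a sample string
GET /my_index/_analyze
{
  "analyzer": "my_custom_analyzer",
  "text": "<p>The Quick Foxes are running</p>"
}
```

With the filters configured above, the HTML tags should be stripped, stop words such as "the" and "are" dropped, and the remaining terms lowercased and stemmed.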

Built-in Analyzers

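Standard Analyzer

The standard analyzer is the default. It splits text on word boundaries, drops most punctuation, and lowercases each token. For the request below it returns the tokens the, quick, brown, fox, jumps, over, the, lazy, dog. Stop words are kept because stop word removal is disabled by default.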

POST _analyze
{
  "analyzer": "standard",
  "text": "The quick brown fox jumps over the lazy dog!"
}

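Simple Analyzer

The simple analyzer splits text at every character that is not a letter and lowercases the result. The request below returns the tokens the, quick, brown, fox.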

POST _analyze
{
  "analyzer": "simple",
  "text": "The QUICK brown FOX!"
}

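Custom Analyzers

A custom analyzer combines built-in components with ones you define yourself. The example below defines a mapping character filter that replaces emoticons with words before tokenization, so that they survive as searchable tokens instead of being discarded as punctuation.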

PUT /blog_posts
{
  "settings": {
    "analysis": {
      "char_filter": {
        "emoticons": {
          "type": "mapping",
          "mappings": [
            ":) => happy",
            ":( => sad"
          ]
        }
      },
      "analyzer": {
        "blog_analyzer": {
          "type": "custom",
          "char_filter": ["emoticons", "html_strip"],
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "stop",
            "asciifolding"
          ]
        }
      }
    }
  }
}
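
To confirm that the character filters and token filters interact as intended, the analyzer can be run on a sample string. The text below is made up for illustration.

```console
# Check emoticon replacement, HTML stripping, and ASCII folding together
GET /blog_posts/_analyze
{
  "analyzer": "blog_analyzer",
  "text": "I <b>love</b> this café :)"
}
```

The emoticon should come back as the token happy, and café should be folded to cafe.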

Tokenizers

Types of Tokenizers

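Standard Tokenizer

Used on its own, the standard tokenizer only splits the text and does not lowercase it. It follows the Unicode Text Segmentation rules, under which a period or underscore between letters does not create a token boundary. The request below therefore returns The, quick.brown_fox, jumped.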

POST _analyze
{
  "tokenizer": "standard",
  "text": "The quick.brown_fox jumped!"
}

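N-gram Tokenizer

The ngram tokenizer emits every substring between min_gram and max_gram characters long, which supports partial matching. The gram sizes are parameters of the tokenizer, so they must be set on a custom tokenizer definition (named autocomplete_tokenizer here) rather than on the analyzer. Because max_gram exceeds min_gram by more than 1, the index also has to raise index.max_ngram_diff.

PUT /autocomplete
{
  "settings": {
    "index": { "max_ngram_diff": 8 },
    "analysis": {
      "tokenizer": {
        "autocomplete_tokenizer": {
          "type": "ngram",
          "min_gram": 2,
          "max_gram": 10
        }
      },
      "analyzer": {
        "autocomplete": {
          "type": "custom",
          "tokenizer": "autocomplete_tokenizer"
        }
      }
    }
  }
}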
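
To see what an n-gram tokenizer emits, the _analyze API accepts an inline tokenizer definition. This sketch uses gram sizes of 2 and 3, chosen here only to keep the output short.

```console
# Inline tokenizer definition; no index is required
POST _analyze
{
  "tokenizer": {
    "type": "ngram",
    "min_gram": 2,
    "max_gram": 3
  },
  "text": "fox"
}
```

The response should contain the tokens fo, fox, and ox. The number of tokens grows quickly with the text length and the gram range, which increases index size.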

Advanced Mapping Features

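Multi-fields

Multi-fields index the same source value in several ways. In the example below, name is searchable as analyzed text, as an exact keyword (name.raw), and with English stemming (name.english).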

PUT /users
{
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "fields": {
          "raw": {
            "type": "keyword"
          },
          "english": {
            "type": "text",
            "analyzer": "english"
          }
        }
      }
    }
  }
}
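
A possible query against these sub-fields is sketched below. The search term and the aggregation name top_names are made up for illustration.

```console
# Search the analyzed variants, aggregate on the exact-value variant
GET /users/_search
{
  "query": {
    "multi_match": {
      "query": "anderson",
      "fields": ["name", "name.english"]
    }
  },
  "aggs": {
    "top_names": {
      "terms": { "field": "name.raw" }
    }
  }
}
```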

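Dynamic Mapping

The dynamic setting controls what happens when a document contains a field that is not in the mapping. With true the field is added automatically, with false it is kept in _source but not indexed, and with strict the document is rejected. The setting can be overridden per object, as metadata does below.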

PUT /dynamic_index
{
  "mappings": {
    "dynamic": "strict",
    "properties": {
      "title": { "type": "text" },
      "created": { "type": "date" },
      "metadata": {
        "type": "object",
        "dynamic": true
      }
    }
  }
}
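
The effect of these settings can be seen with two sample documents. The field values and the unmapped author field are made up for illustration.

```console
# Accepted: new keys are allowed under metadata, which sets dynamic to true
POST /dynamic_index/_doc
{
  "title": "Release notes",
  "created": "2024-01-15",
  "metadata": { "source": "import-job" }
}

# Rejected: author is not mapped and the top level is strict
POST /dynamic_index/_doc
{
  "title": "Release notes",
  "author": "jane"
}
```

The second request should fail with a strict_dynamic_mapping_exception.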

Mapping Parameters

Common Parameters

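Not every parameter applies to every field type. In particular, text fields support neither doc_values nor null_value, so the example sets those on a keyword field instead (status is a field name added here for illustration). The copy_to target, all_fields, is mapped explicitly as text.

PUT /products
{
  "mappings": {
    "properties": {
      "description": {
        "type": "text",
        "index": true,
        "store": false,
        "copy_to": "all_fields"
      },
      "status": {
        "type": "keyword",
        "doc_values": true,
        "null_value": "N/A"
      },
      "all_fields": { "type": "text" }
    }
  }
}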
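
The copy_to parameter is mostly useful at query time: values copied into all_fields can be searched through that single field. A minimal sketch, with a made-up search term:

```console
# Search the combined field populated by copy_to
GET /products/_search
{
  "query": {
    "match": { "all_fields": "wireless" }
  }
}
```

Copied values are indexed in the target field but are not added to the document's _source.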

Use Cases and Examples

Search Optimization

Product Search

PUT /e-commerce
{
  "mappings": {
    "properties": {
      "product_name": {
        "type": "text",
        "analyzer": "english",
        "fields": {
          "exact": {
            "type": "keyword"
          },
          "suggest": {
            "type": "text",
            "analyzer": "simple"
          }
        }
      }
    }
  }
}
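
One way to query this mapping is sketched below: stemmed full-text matching on product_name, plus a boost for documents whose exact name matches. The search terms and the boost value are assumptions for illustration.

```console
# Stemmed match, with a boost when the exact product name matches
GET /e-commerce/_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "product_name": "running shoes" } },
        {
          "term": {
            "product_name.exact": {
              "value": "Running Shoes",
              "boost": 2.0
            }
          }
        }
      ]
    }
  }
}
```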

Multi-language Support

PUT /multi_lang_posts
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "fields": {
          "english": {
            "type": "text",
            "analyzer": "english"
          },
          "spanish": {
            "type": "text",
            "analyzer": "spanish"
          }
        }
      }
    }
  }
}
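
When the language of a query is not known in advance, one option is to search all language variants at once. The sketch below uses multi_match with the most_fields type; the query term is made up for illustration.

```console
# Search the English and Spanish variants of the title together
GET /multi_lang_posts/_search
{
  "query": {
    "multi_match": {
      "query": "corriendo",
      "fields": ["title.english", "title.spanish"],
      "type": "most_fields"
    }
  }
}
```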

Best Practices

Mapping Design

- Plan your mapping before indexing
- Use appropriate field types
- Consider future requirements

Analyzer Selection

- Choose analyzers based on use case
- Test analyzer output
- Consider language-specific needs

Performance Optimization

- Use doc_values appropriately
- Limit number of fields
- Monitor mapping size
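
As one example of using doc_values appropriately, they can be switched off on a keyword field that is only ever filtered on and never sorted or aggregated, which saves disk space. The index name logs_example and the field session_id are hypothetical.

```console
# Hypothetical index: session_id is used for filtering only
PUT /logs_example
{
  "mappings": {
    "properties": {
      "session_id": {
        "type": "keyword",
        "doc_values": false
      }
    }
  }
}
```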

Common Issues and Solutions

Mapping Explosion

- Set limits on field count
- Use nested objects carefully
- Monitor mapping size
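
The field count limit is controlled by the index.mapping.total_fields.limit setting, which defaults to 1000 and can be changed on an existing index. The value 500 below is an arbitrary example.

```console
# Lower the maximum number of mapped fields for this index
PUT /products/_settings
{
  "index.mapping.total_fields.limit": 500
}
```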

Analysis Issues

- Test analyzers before deployment
- Use appropriate tokenizers
- Consider edge cases
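
When a search returns unexpected results, it helps to check which tokens a specific field produces. The _analyze API accepts a field name and applies that field's analyzer. The sample text below is made up, and the request assumes the articles index defined earlier.

```console
# Analyze text using the analyzer configured for the title field
GET /articles/_analyze
{
  "field": "title",
  "text": "Mapping & Analyzers: A Beginner's Guide",
  "explain": true
}
```

With explain set to true, the response includes additional detail about each stage of the analysis.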

Next Steps

After understanding mapping and analyzers:

- Learn about search templates
- Explore index aliases
- Implement reindexing strategies
- Master index lifecycle management

Conclusion

Proper mapping and text analysis are crucial for:

- Accurate search results
- Efficient data storage
- Optimal performance
- Flexible querying

Remember to:

- Plan your mappings carefully
- Test analyzers thoroughly
- Monitor performance
- Update mappings when needed
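
Updating a mapping usually means adding fields. A minimal sketch, in which the brand field is a hypothetical addition to the products index:

```console
# Add a new field to an existing index
PUT /products/_mapping
{
  "properties": {
    "brand": { "type": "keyword" }
  }
}
```

The type of an existing field cannot be changed in place. That requires creating a new index with the corrected mapping and reindexing the data into it.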

Stay tuned for our next article on search templates and index patterns in Elasticsearch.
