ElasticSearch Cookbook

Mapping a multifield

Often, a field must be processed with several core types or in different ways. For example, a string field may need to be analyzed for searching and not_analyzed for sorting. To do this, we need to define a multifield.

Multifield is a very powerful feature of mapping, because it allows the use of the same field in different ways.

Getting ready

You need a working ElasticSearch cluster.

How to do it...

To define a multifield we need to do the following:

  1. Use multi_field as type.
  2. Define a dictionary called fields that contains the subfields. The subfield with the same name as the parent field is the default one.

If we consider the item of our order example, we can index the name as multi_field as shown in the following code:

"name": {
  "type": "multi_field",
  "fields": {
    "name": {
      "type": "string",
      "index": "not_analyzed"
    },
    "tk": {
      "type": "string",
      "index": "analyzed"
    },
    "code": {
      "type": "string",
      "index": "analyzed",
      "analyzer": "code_analyzer"
    }
  }
},

If we already have a mapping stored in ElasticSearch and we want to upgrade a field to a multifield, it is enough to save a new mapping that declares the field with the multi_field type; ElasticSearch merges it automatically with the existing mapping.
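
As a rough sketch of such an upgrade, the following Python snippet wraps an existing string mapping into a multi_field whose default subfield keeps the original definition. The helper name to_multi_field and the starting mapping are hypothetical, chosen only for illustration; the snippet only builds the mapping body, which would then be saved as the new mapping for the type.

```python
import json


def to_multi_field(field_name, current_mapping, extra_subfields):
    """Wrap an existing core-type mapping into a multi_field definition.

    Hypothetical helper: the default subfield (same name as the parent)
    keeps the current definition, so the base path still refers to it.
    """
    fields = {field_name: dict(current_mapping)}
    fields.update(extra_subfields)
    return {field_name: {"type": "multi_field", "fields": fields}}


# Assumed starting point: "name" is currently a plain not_analyzed string.
current = {"type": "string", "index": "not_analyzed"}
upgraded = to_multi_field(
    "name",
    current,
    {"tk": {"type": "string", "index": "analyzed"}},
)
print(json.dumps(upgraded, indent=2))
```

Keeping the original definition as the default subfield is a design choice of this sketch: the path name continues to behave as before, while name.tk becomes available as an additional path.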

How it works...

During indexing, when ElasticSearch processes a field of type multi_field, it reprocesses the same value once for every subfield defined in the mapping.

To access the subfields of multi_field, we have a new path built on the base field plus the subfield name. If we consider the preceding example, we have:

  • name: This points to the default subfield of the multifield (the not_analyzed one)
  • name.tk: This points to the standard analyzed (tokenized) field
  • name.code: This points to a field analyzed with a code extractor analyzer
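
The path rule above can be sketched in a few lines of Python. This is only an illustration of the naming convention, not code from ElasticSearch; the function name subfield_paths is hypothetical, and the mapping dictionary simply mirrors the example from this recipe.

```python
def subfield_paths(field_name, mapping):
    """Return {path: subfield definition} for a multi_field mapping."""
    paths = {}
    for sub_name, definition in mapping["fields"].items():
        # The subfield named like its parent is the default one and is
        # reachable through the base field name alone.
        if sub_name == field_name:
            path = field_name
        else:
            path = field_name + "." + sub_name
        paths[path] = definition
    return paths


# Same structure as the "name" mapping shown in this recipe.
name_mapping = {
    "type": "multi_field",
    "fields": {
        "name": {"type": "string", "index": "not_analyzed"},
        "tk": {"type": "string", "index": "analyzed"},
        "code": {
            "type": "string",
            "index": "analyzed",
            "analyzer": "code_analyzer",
        },
    },
}

paths = subfield_paths("name", name_mapping)
for path in sorted(paths):
    print(path, "->", paths[path]["index"])
```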

Note that in the preceding example the code subfield uses a different analyzer, code_analyzer: a code extractor analyzer that extracts the item code from a string.

With this multifield, if we index a string such as "Good item to buy - ABC1234", we'll have:

  • name = "Good item to buy - ABC1234" (useful for sorting)
  • name.tk = ["good", "item", "to", "buy", "abc1234"] (useful for searching)
  • name.code = ["ABC1234"] (useful for searching and faceting)
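
To make the three results concrete, here is a small plain-Python emulation. It does not call ElasticSearch and does not reproduce its real analyzers: the tokenizer is approximated by lowercasing and splitting on non-word characters, and the code extractor assumes that an item code is a run of uppercase letters followed by digits, since the definition of code_analyzer is not shown in this recipe. The function name emulate_subfields is hypothetical.

```python
import re


def emulate_subfields(value):
    """Approximate the indexed terms of each subfield for one string.

    Plain-Python stand-ins, not ElasticSearch's real analyzers:
    - "name" is not_analyzed, so the whole string is a single term;
    - "name.tk" is approximated by lowercasing and splitting on
      non-word characters;
    - "name.code" assumes a code is uppercase letters followed by digits.
    """
    return {
        "name": [value],
        "name.tk": re.findall(r"\w+", value.lower()),
        "name.code": re.findall(r"\b[A-Z]+\d+\b", value),
    }


terms = emulate_subfields("Good item to buy - ABC1234")
for path, tokens in terms.items():
    print(path, "=", tokens)
```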

There's more...

Multifield is very useful in data processing, because it allows you to define several ways to process a field's data.

For example, if we are working with document content, we can define subfields whose analyzers extract names, places, dates/times, geo locations, and so on. The subfields of a multifield are standard core type fields; we can do everything we want with them, such as searching, filtering, faceting, and scripting.
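
As an illustration of using each subfield for a different job, the following Python sketch builds a search body that searches on name.tk, sorts on name, and facets on name.code. It assumes the query DSL of the ElasticSearch release this recipe targets (a match query and a terms facet); the function name build_search and the facet name codes are hypothetical, and the snippet only builds the body without sending it.

```python
import json


def build_search(text):
    """Build a search body that uses each subfield for a different job."""
    return {
        # Full-text search runs against the tokenized subfield.
        "query": {"match": {"name.tk": text}},
        # Sorting uses the default not_analyzed subfield.
        "sort": [{"name": "asc"}],
        # Faceting counts the extracted item codes.
        "facets": {"codes": {"terms": {"field": "name.code"}}},
    }


body = build_search("good item")
print(json.dumps(body, indent=2))
```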

See also

  • Mapping different analyzers