Jun 16, 2025

Prepare your index structure

Use categorical attributes

Categorical attributes have a fixed set of possible string values. Including them in your index structure groups your records into smaller, well-defined buckets. For example:

[
  { "objectID": "ID01", "title": "Chair", "color": "red", "categories": ["Furniture", "Outdoors"] },
  { "objectID": "ID02", "title": "Table", "color": "green", "categories": ["Furniture", "Outdoors"] },
  { "objectID": "ID03", "title": "Water bottle", "color": "blue", "categories": ["Outdoors"] },
  { "objectID": "ID04", "title": "Book", "color": "red", "categories": ["Books"] },
  { "objectID": "ID05", "title": "Headphones", "color": "green", "categories": ["Electronics"] },
  { "objectID": "ID06", "title": "Phone", "color": "blue", "categories": ["Electronics"] },
  { "objectID": "ID07", "title": "Television", "color": "red", "categories": ["Electronics"] },
  { "objectID": "ID08", "title": "Can", "color": "green", "categories": ["Cooking & Dining", "Kitchenware"] },
  { "objectID": "ID09", "title": "Bowl", "color": "blue", "categories": ["Cooking & Dining", "Kitchenware"] }
]

Good categorical attributes

An attribute like color with finite values like red, green, and blue is a good categorical attribute because it organizes your index into three distinct buckets. Other good categorical attributes include gender, brand, and categories.

Bad categorical attributes

Attributes that are unique for each record, such as objectID and title, are poor choices because they don’t offer any grouping into buckets. Other poor categorical attributes include description, sku, and price.
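As a rough way to tell good candidates from poor ones, you can count how many distinct values each attribute takes across your records. The following Python sketch is only an illustration: the helper name `distinct_value_counts` is hypothetical, and the records are a subset of the example above.

```python
from collections import defaultdict


def distinct_value_counts(records):
    """Map each attribute to the number of distinct values it takes across records."""
    values = defaultdict(set)
    for record in records:
        for attribute, value in record.items():
            # Treat list values (such as "categories") as several separate values.
            items = value if isinstance(value, list) else [value]
            for item in items:
                values[attribute].add(item)
    return {attribute: len(seen) for attribute, seen in values.items()}


records = [
    {"objectID": "ID01", "title": "Chair", "color": "red", "categories": ["Furniture", "Outdoors"]},
    {"objectID": "ID02", "title": "Table", "color": "green", "categories": ["Furniture", "Outdoors"]},
    {"objectID": "ID03", "title": "Water bottle", "color": "blue", "categories": ["Outdoors"]},
    {"objectID": "ID04", "title": "Book", "color": "red", "categories": ["Books"]},
]

counts = distinct_value_counts(records)
for attribute, count in counts.items():
    # An attribute with as many distinct values as records offers no grouping.
    label = "unique per record" if count == len(records) else "groups records"
    print(f"{attribute}: {count} distinct values ({label})")
```

On a real index, attributes whose distinct-value count stays small as the number of records grows are the likelier categorical candidates.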

Avoid nesting attributes

Aim to organize data within records with a flat structure. Reserve deeper nesting for hierarchical facets.

Consider the following example of several nested keys:

{
  "key1": {
      "key2": {
          "key3": "value"  
        }
    }
}

To make it easier for Advanced Personalization to process the index, simplify the attribute-value pair:

{
  "key1_key2_key3": "value"
}

To optimize performance, Advanced Personalization limits the number of key-value pairs for nested attributes to 50. Staying within this limit ensures efficient processing of your nested attributes. If you need to exceed this limit, contact the Algolia support team.
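If your source data is already nested, you could flatten records before indexing them. The following Python sketch shows one possible approach. The `flatten` helper and the underscore separator are assumptions chosen to match the example above; they aren't part of any Algolia API.

```python
def flatten(record, parent_key="", separator="_"):
    """Collapse nested dictionaries into a single level of joined keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{separator}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the joined key as a prefix.
            flat.update(flatten(value, full_key, separator))
        else:
            flat[full_key] = value
    return flat


nested = {"key1": {"key2": {"key3": "value"}}}
flat = flatten(nested)
print(flat)  # {'key1_key2_key3': 'value'}
```

This sketch flattens every nested object, so you would need to leave out any attributes you intentionally keep nested for hierarchical facets.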

Avoid mixing attribute types

Maintaining a consistent type for attributes is an important step in preparing your index structure.

[
  { "color": "red" },
  { "color": ["red", "blue"] },
  { "color": 250000000 }
]

Using this index as is would lead to unexpected results because the color attribute can be a string, an array, or an integer.

A structure like this often indicates an underlying issue with your data. You must evaluate your index and ensure a consistent type for your attributes.

[
  { "color": "red" },
  { "color": "green" },
  { "color": "blue" }
]
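To find attributes with inconsistent types, you could scan your records before indexing them. The following Python sketch is a minimal illustration: the `find_mixed_types` helper is hypothetical, and the type names it reports are Python's, not Algolia's.

```python
from collections import defaultdict


def find_mixed_types(records):
    """Return the attributes that appear with more than one value type."""
    types = defaultdict(set)
    for record in records:
        for attribute, value in record.items():
            types[attribute].add(type(value).__name__)
    # Keep only attributes seen with two or more different types.
    return {attribute: sorted(seen) for attribute, seen in types.items() if len(seen) > 1}


records = [
    {"color": "red"},
    {"color": ["red", "blue"]},
    {"color": 250000000},
]

mixed = find_mixed_types(records)
print(mixed)  # {'color': ['int', 'list', 'str']}
```

An empty result means every attribute has a single type across the records that were checked.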

Avoid mixing attributes from different domains

When preparing your index structure, you must ensure that attributes are relevant to a single domain.

For example, an index for articles shouldn’t contain attributes relevant to product information and vice versa.

Language is another common source of mixed domains: records in different languages within the same index lead to a mix of attribute values.

[
  { 
    "objectID": "MH001", 
    "color": "red", 
    "title": "Bag" 
  },
  { 
    "objectID": "MH002", 
    "color": "rouge", 
    "title": "Sac" 
  },
  { 
    "objectID": "MH003", 
    "color": "blue", 
    "title": "Chair" 
  }
]

The second record is out of place in this index because it has French language attributes. Fix this by reviewing the index to ensure attributes from different domains aren’t mixed.

[
  { 
    "objectID": "MH001", 
    "color": "red", 
    "title": "Bag" 
  },
  { 
    "objectID": "MH002", 
    "color": "red", 
    "title": "Bag" 
  },
  { 
    "objectID": "MH003", 
    "color": "blue", 
    "title": "Chair" 
  }
]

How Advanced Personalization validates attributes

Advanced Personalization prioritizes attributes that directly improve personalization, so that user profiles are built on meaningful data. It also filters out attributes based on user interactions, applying the following checks:

Require significant user interaction. Each attribute-value pair must show a significant number of interactions, although Advanced Personalization doesn’t set a fixed benchmark. For example, if it randomly picks users from the last month for the color:red attribute-value pair, it expects some of them to have interacted with red products.
Discard attributes with minimal relevance. If an attribute applies to too few products, it’s often filtered out due to a lack of user interactions.
Discard attributes with excessive diversity. For example, Advanced Personalization might filter out an attribute like brand if it detects thousands of distinct brands across millions of products, yet no individual brand garners enough user interactions to be considered “important”.
Discard unusual attribute values. While attributes with numerous unique values aren’t discarded, Advanced Personalization doesn’t process the unusual values (for example, color:pink_with_brown_dots).
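To get a rough idea of which attribute values might be too rare to matter, you could measure how many records carry each value. The following Python sketch is not how Advanced Personalization validates attributes: it looks only at the index, not at user interactions, and both the `value_coverage` helper and the `MIN_SHARE` threshold are hypothetical.

```python
from collections import Counter


def value_coverage(records, attribute):
    """Return the share of records that carry each value of the attribute."""
    counts = Counter()
    for record in records:
        value = record.get(attribute)
        if value is None:
            continue
        items = value if isinstance(value, list) else [value]
        # Count each value once per record, even if a list repeats it.
        counts.update(set(items))
    total = len(records)
    return {value: count / total for value, count in counts.items()}


records = [
    {"objectID": "ID01", "color": "red"},
    {"objectID": "ID02", "color": "red"},
    {"objectID": "ID03", "color": "blue"},
    {"objectID": "ID04", "color": "blue"},
    {"objectID": "ID05", "color": "pink_with_brown_dots"},
]

MIN_SHARE = 0.25  # Hypothetical cutoff for illustration only.

coverage = value_coverage(records, "color")
rare = sorted(value for value, share in coverage.items() if share < MIN_SHARE)
print(rare)  # ['pink_with_brown_dots']
```

Values that cover very few records are less likely to collect enough interactions, so they are reasonable candidates to review or normalize.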
