Prepare your index structure
To optimize personalization, you should configure your existing index’s structure to align with the requirements of the Advanced Personalization feature. A well-prepared index structure enhances this feature’s ability to deliver personalized search experiences for your website or app.
This feature isn’t available on every plan. Refer to your pricing plan to see if it’s included.
Use categorical attributes
Categorical attributes are attributes with a fixed number of possible string values. Incorporating such attributes into your index structure categorizes your data into well-defined, smaller buckets. For example:
[
{ "objectID": "ID01", "title": "Chair", "color": "red", "categories": ["Furniture", "Outdoors"] },
{ "objectID": "ID02", "title": "Table", "color": "green", "categories": ["Furniture", "Outdoors"] },
{ "objectID": "ID03", "title": "Water bottle", "color": "blue", "categories": ["Outdoors"] },
{ "objectID": "ID04", "title": "Book", "color": "red", "categories": ["Books"] },
{ "objectID": "ID05", "title": "Headphones", "color": "green", "categories": ["Electronics"] },
{ "objectID": "ID06", "title": "Phone", "color": "blue", "categories": ["Electronics"] },
{ "objectID": "ID07", "title": "Television", "color": "red", "categories": ["Electronics"] },
{ "objectID": "ID08", "title": "Can", "color": "green", "categories": ["Cooking & Dining", "Kitchenware"] },
{ "objectID": "ID09", "title": "Bowl", "color": "blue", "categories": ["Cooking & Dining", "Kitchenware"] }
]
Good categorical attributes
An attribute like color with finite values like red, green, and blue is a good categorical attribute because it organizes your index into three distinct buckets.
Other good categorical attributes include gender, brand, and categories.
Bad categorical attributes
Attributes that are unique for each record, such as objectID and title, are poor choices because they don’t offer any grouping into buckets.
Other poor categorical attributes include description, sku, and price.
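To get a rough feel for which attributes in your own records behave like categorical attributes, you can count the distinct values each attribute takes. The following Python snippet is a minimal sketch, not part of Advanced Personalization: the helper names distinct_values and looks_categorical are hypothetical, and the rule "fewer distinct values than records" is only an assumed heuristic for illustration. It also assumes attribute values are strings, numbers, or lists of those.

```python
from collections import defaultdict


def distinct_values(records):
    """Map each attribute name to the set of distinct values it takes."""
    values = defaultdict(set)
    for record in records:
        for attribute, value in record.items():
            # Treat a list such as "categories" as several separate values.
            items = value if isinstance(value, list) else [value]
            values[attribute].update(items)
    return values


def looks_categorical(records, attribute):
    """Assumed heuristic: fewer distinct values than records means values are shared."""
    return len(distinct_values(records)[attribute]) < len(records)


records = [
    {"objectID": "ID01", "title": "Chair", "color": "red", "categories": ["Furniture", "Outdoors"]},
    {"objectID": "ID02", "title": "Table", "color": "green", "categories": ["Furniture", "Outdoors"]},
    {"objectID": "ID03", "title": "Water bottle", "color": "blue", "categories": ["Outdoors"]},
    {"objectID": "ID04", "title": "Book", "color": "red", "categories": ["Books"]},
]

for attribute, values in distinct_values(records).items():
    print(attribute, len(values), looks_categorical(records, attribute))
```

On this small sample, color and categories share values across records, while objectID and title are unique per record. On a real index, the ratio of distinct values to records is more informative than a strict comparison.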
Avoid nesting attributes
Aim to organize data within records with a flat structure. Reserve deeper nesting for hierarchical facets.
Consider the following example of several nested keys:
{
  "key1": {
    "key2": {
      "key3": "value"
    }
  }
}
To make it easier for Advanced Personalization to process the index, flatten the nested keys into a single attribute-value pair:
{
  "key1_key2_key3": "value"
}
To optimize performance, Advanced Personalization limits the number of key-value pairs for nested attributes to 50. Staying within this limit keeps the processing of your nested attributes efficient. If you need to exceed this limit, contact the Algolia support team.
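The flattening shown above can be sketched as a small recursive function that you run on records before indexing them. This is an illustrative Python sketch: the function name flatten and the underscore separator are assumptions that mirror the key1_key2_key3 example, not an Algolia API. It flattens every nested object, so you would skip any attributes you deliberately keep nested, such as hierarchical facets.

```python
def flatten(record, separator="_", prefix=""):
    """Return a copy of record with nested objects collapsed into flat keys."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{separator}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the joined key as prefix.
            flat.update(flatten(value, separator, name))
        else:
            # Strings, numbers, and lists are kept unchanged.
            flat[name] = value
    return flat


nested = {"key1": {"key2": {"key3": "value"}}}
print(flatten(nested))  # {'key1_key2_key3': 'value'}
```

Note that this sketch drops empty nested objects and doesn't guard against two different paths producing the same flattened key.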
Avoid mixing attribute types
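Keeping each attribute’s type consistent across all records is an important step in preparing your index structure. In the following example, the type of the color attribute changes from record to record: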
[
{ "color": "red" },
{ "color": ["red", "blue"] },
{ "color": 250000000 }
]
Using this index as is would lead to unexpected results because the color attribute can be a string, an array, or an integer.
A structure like this often indicates an underlying issue with your data. Review your index and make sure each attribute has the same type in every record:
[
{ "color": "red" },
{ "color": "green" },
{ "color": "blue" }
]
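One way to spot attributes with inconsistent types is to collect the type of each attribute's value across your records and report any attribute that has more than one. The snippet below is a minimal Python sketch that works on records you have already exported or loaded; the helper names attribute_types and mixed_type_attributes are hypothetical.

```python
from collections import defaultdict


def attribute_types(records):
    """Map each attribute name to the set of Python type names seen for it."""
    types = defaultdict(set)
    for record in records:
        for attribute, value in record.items():
            types[attribute].add(type(value).__name__)
    return types


def mixed_type_attributes(records):
    """Return only the attributes that appear with more than one type."""
    return {
        attribute: sorted(names)
        for attribute, names in attribute_types(records).items()
        if len(names) > 1
    }


records = [
    {"color": "red"},
    {"color": ["red", "blue"]},
    {"color": 250000000},
]

print(mixed_type_attributes(records))  # {'color': ['int', 'list', 'str']}
```

An empty result means every attribute has a single type across the records checked.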
Avoid mixing attributes from different domains
When preparing your index structure, you must ensure that attributes are relevant to a single domain.
For example, an index for articles shouldn’t contain attributes relevant to product information and vice versa.
Language is another common source of mixed attributes: an index can end up with records whose attribute values are in different languages.
[
  {
    "objectID": "MH001",
    "color": "red",
    "title": "Bag"
  },
  {
    "objectID": "MH002",
    "color": "rouge",
    "title": "Sac"
  },
  {
    "objectID": "MH003",
    "color": "blue",
    "title": "Chair"
  }
]
The second record is out of place in this index because its attribute values are in French. Fix this by reviewing the index and making sure attributes from different domains aren’t mixed:
[
  {
    "objectID": "MH001",
    "color": "red",
    "title": "Bag"
  },
  {
    "objectID": "MH002",
    "color": "red",
    "title": "Bag"
  },
  {
    "objectID": "MH003",
    "color": "blue",
    "title": "Chair"
  }
]
How Advanced Personalization validates attributes
Advanced Personalization prioritizes attributes that directly improve personalization, building user profiles on meaningful data. It also filters out attributes based on user interactions.
- An attribute-value pair must show significant user interaction. Advanced Personalization doesn’t set a fixed benchmark for checks but instead looks for a significant number of interactions. If Advanced Personalization randomly picks users from last month for the color:red attribute, it expects some users to have interacted with red products.
- Discard attributes with minimal relevance. If an attribute applies to too few products, it’s often filtered out due to a lack of user interactions.
- Discard attributes with excessive diversity. For example, Advanced Personalization might filter out an attribute like brand if it detects thousands of distinct brands across millions of products, yet no individual brand garners enough user interactions to be considered “important”.
- Discard unusual attribute values. While attributes with numerous unique values aren’t discarded, Advanced Personalization doesn’t process the unusual values (for example, color:pink_with_brown_dots).
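The checks above run inside Advanced Personalization and, as noted, don't use a fixed benchmark. If you want a rough preview of which attribute-value pairs attract few interactions in your own data, you could count distinct users per pair yourself. The following Python sketch is purely illustrative: the events format (pairs of user and objectID), the helper name users_per_pair, and the MIN_USERS cutoff are all assumptions, not the feature's actual logic or thresholds.

```python
from collections import defaultdict


def users_per_pair(records, events):
    """Count distinct users who interacted with each attribute:value pair."""
    by_id = {record["objectID"]: record for record in records}
    users = defaultdict(set)
    for user, object_id in events:
        record = by_id.get(object_id)
        if record is None:
            continue  # Skip events for records that are no longer in the index.
        for attribute, value in record.items():
            if attribute == "objectID":
                continue
            # Treat list attributes as several separate values.
            for item in value if isinstance(value, list) else [value]:
                users[f"{attribute}:{item}"].add(user)
    return {pair: len(seen) for pair, seen in users.items()}


records = [
    {"objectID": "ID01", "color": "red"},
    {"objectID": "ID02", "color": "red"},
    {"objectID": "ID03", "color": "pink_with_brown_dots"},
]
# Hypothetical interaction log: (user, objectID) pairs.
events = [("u1", "ID01"), ("u2", "ID02"), ("u3", "ID01"), ("u1", "ID03")]

MIN_USERS = 2  # Hypothetical cutoff, chosen only for this example.
counts = users_per_pair(records, events)
rare = sorted(pair for pair, n in counts.items() if n < MIN_USERS)
print(counts)
print(rare)
```

In this sample, color:red is backed by three users, while color:pink_with_brown_dots is backed by one and would be flagged as rare under the assumed cutoff.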