Pop quiz! What’s a “product”?
Did this question stump you? Or did it seem…too easy? Either way, it’s something we should explore further because there isn’t a clear definition. You might have immediately thought of something like this:
product (noun): something produced; something marketed or sold as a commodity — from Merriam-Webster
And that’s a great start! But as developers and businesspeople, we start to run into quirky edge-cases that this definition doesn’t quite cover.
For example, one of my coworkers recently bought a pair of boots — these boots, in fact. If you click on that link, it’ll take you to the product page for these boots on the Walmart website. Now, if we assume that this page represents a product, let me ask you a few follow-up questions:
By our bland dictionary definition, the answer to all three of these would be yes, as they’re all distinct objects that were produced. However, they each come with some nuance:
Clearly, we need a better definition. Here’s one that I’d like to put forward:
product (noun): an item, service, or category of items or services, as distinguished from other items, services, or categories of items or services
This definition seems more fitting because when you really delve into the details, products are defined by being distinct from other products. We lump all of the shoe sizes into a single product because they’re all far more similar to each other than they are to anything else in Walmart’s product database. They feel like versions of a single product instead of different products, because things that are actually other products differ from them far more.
Here’s a visual example. When you see these products, with the distances between them roughly representing their differences, where do you draw the product boundaries?
Sensibly, you’d draw them here:
This has a lot of relevance to search, the field we’re experts in over here at Algolia. When developing the search experience for these products, every company has to ask a few questions about where those boundaries fall. For example, which attributes will our customers be searching for? The searchable products that we feed into Algolia need to be independent, so if folks are going to include their shoe size in the search query, then at least in this case, we need to treat each of those shoe sizes as a separate product.
In the world of search, depending on business requirements, some of the differences between versions of a single product grow more relevant. Those clusters are no longer neat and tidy like I drew them before. Our visualized space can start to look more like this:
Now where do you draw the lines?
Let’s imagine a completely hypothetical company (let’s just call them Orange) that makes smartphones (let’s just call them jPhones). jPhones come in a lot of different versions:
Orange has put a lot of effort into making their product database robust, but they haven’t given us much to work with when it comes to search, which has a different set of requirements, as we discussed earlier. Now, they’ve tasked you and me with building a search index for the jPhones that takes those options into account. They’ve given us this initial JSON to modify:
[
{
"title": "jPhone",
"editions": [
"13",
"12",
"11"
],
"color": [
"Aluminum",
"Sparkly Green",
"Maroon",
"Moonlight",
"Blackout",
"Deep Sea Blue"
],
"sizes": [
"Regular",
"XL"
],
"storageSpace": [
"128GB",
"256GB",
"512GB"
]
}
]
Let’s start by going through each of the options one by one and see how they affect what ends up as product attributes and what ends up meriting its own product.
When folks search for jPhones, they’ll likely include the edition in their search. For example, they know that just searching “jphone” will get them a very wide range of results, so they’ll search “jphone 13”.
How does this affect our search index? Well, it’s unlikely that a user would search for something more specific than they’d expect the product listing to be. In other words, by searching for “jphone 13”, they expect to get back a result for a jPhone 13, not a general jPhone result which allows them to choose the 13 later. So in our search index, we need to split out the editions into different searchable products. Now our JSON for our search index looks like this:
[
{
"title": "jPhone 13",
"color": [
"Aluminum",
"Sparkly Green",
"Maroon",
"Moonlight",
"Blackout",
"Deep Sea Blue"
],
"sizes": [
"Regular",
"XL"
],
"storageSpace": [
"128GB",
"256GB",
"512GB"
]
},
{
"title": "jPhone 12",
"color": [
"Aluminum",
"Sparkly Green",
"Maroon",
"Moonlight",
"Blackout",
"Deep Sea Blue"
],
"sizes": [
"Regular",
"XL"
],
"storageSpace": [
"128GB",
"256GB",
"512GB"
]
},
{
"title": "jPhone 11",
"color": [
"Aluminum",
"Sparkly Green",
"Maroon",
"Moonlight",
"Blackout",
"Deep Sea Blue"
],
"sizes": [
"Regular",
"XL"
],
"storageSpace": [
"128GB",
"256GB",
"512GB"
]
}
]
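As a rough illustration, the edition split could be scripted rather than written by hand. This is a minimal Python sketch, not Orange’s or Algolia’s actual tooling; the helper name split_by_edition is made up for this example, and the input is simply the initial JSON from above.

```python
import json

# The initial record from Orange, copied from the JSON above.
product = {
    "title": "jPhone",
    "editions": ["13", "12", "11"],
    "color": [
        "Aluminum", "Sparkly Green", "Maroon",
        "Moonlight", "Blackout", "Deep Sea Blue",
    ],
    "sizes": ["Regular", "XL"],
    "storageSpace": ["128GB", "256GB", "512GB"],
}


def split_by_edition(item):
    """Return one searchable record per edition (hypothetical helper)."""
    records = []
    for edition in item["editions"]:
        # Keep every attribute except the list of editions.
        record = {key: value for key, value in item.items() if key != "editions"}
        # Fold the edition into the title, e.g. "jPhone 13".
        record["title"] = f"{item['title']} {edition}"
        records.append(record)
    return records


records = split_by_edition(product)
print(json.dumps(records, indent=2))
```

Each edition ends up as its own record, while color, sizes, and storageSpace stay on the record as lists of options.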
Onto the colors! This is a tricky one. The coworker with the boots and I spoke about this a bit, and he was on the fence about this one. What do you think it should be? Should each color be a separate product, or should they be options on one product? Maybe it’d be a good exercise to pause for a moment and ponder the ramifications of either choice before continuing.
Have your choice? The part that makes this tricky is that you could make an argument for either side. On one hand, the colors aren’t terribly important, and they don’t change the production or order fulfillment process that much. They’re arguably just cosmetic changes which modify the price slightly. On the other hand, the customer base has collectively valued some colors higher than others because of their rarity, and preferences will likely cause some potential customers to search specifically for “sparkly green jphone 13” instead of the more generic option.
The latter argument seems more convincing, doesn’t it? Often, the best practice in search results is to be as specific as you would expect a reasonably specific consumer to be, and in this case, Orange might expect a significant number of consumers to search specifically for their favorite color, or for one that the community perceives to be more valuable. Perhaps you’re reminded of a real-life company that sounds a bit like Orange (despite our intentions to create a completely fictional example) which does not implement this particular suggestion. As leading search experts in the field, we at Algolia would humbly submit that that particular company should A/B test a version of their search index with the different colors split out into different products and see what happens! If you’re watching and reading, we’d be willing to wager on the result. You know who you are.
After splitting out the colors, here’s our new JSON (abbreviated, as this is getting too long for article form):
[
{
"title": "jPhone 13",
"color": "Aluminum",
"sizes": [
"Regular",
"XL"
],
"storageSpace": [
"128GB",
"256GB",
"512GB"
]
},
...
{
"title": "jPhone 13",
"color": "Deep Sea Blue",
"sizes": [
"Regular",
"XL"
],
"storageSpace": [
"128GB",
"256GB",
"512GB"
]
},
...
{
"title": "jPhone 13",
"color": "Maroon",
"sizes": [
"Regular",
"XL"
],
"storageSpace": [
"128GB",
"256GB",
"512GB"
]
},
...
{
"title": "jPhone 12",
"color": "Moonlight",
"sizes": [
"Regular",
"XL"
],
"storageSpace": [
"128GB",
"256GB",
"512GB"
]
}
]
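Since the full list is too long to print, here is a hedged Python sketch of how the edition and color split could be generated. The function name split_by_edition_and_color is hypothetical, and the input is the initial JSON Orange gave us; treat it as one possible approach rather than a prescribed one.

```python
from itertools import product as cartesian_product

# The initial record from Orange, copied from the first JSON sample.
phone = {
    "title": "jPhone",
    "editions": ["13", "12", "11"],
    "color": [
        "Aluminum", "Sparkly Green", "Maroon",
        "Moonlight", "Blackout", "Deep Sea Blue",
    ],
    "sizes": ["Regular", "XL"],
    "storageSpace": ["128GB", "256GB", "512GB"],
}


def split_by_edition_and_color(item):
    """Build one record per (edition, color) pair (hypothetical helper)."""
    records = []
    for edition, color in cartesian_product(item["editions"], item["color"]):
        records.append({
            "title": f"{item['title']} {edition}",
            "color": color,  # a single value now, not a list of options
            # Sizes and storage stay as lists of options on each record.
            "sizes": list(item["sizes"]),
            "storageSpace": list(item["storageSpace"]),
        })
    return records


records = split_by_edition_and_color(phone)
print(len(records))  # 3 editions x 6 colors = 18 records
print(records[0]["title"], "/", records[0]["color"])
```

Three editions times six colors gives 18 searchable records, each shaped like the entries in the abbreviated JSON above.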
In our fictional universe here, there are no differences between Regular and XL jPhones other than their physical size, and since the production process is almost identical save for the larger shell, we’ll imagine that the price is the same as well. In this case, do you feel like size is a significant enough factor when it comes to search to warrant creating separate search records for the differently sized jPhones?
Interestingly, we can make the same two opposing arguments about size as we can about color. On one hand, the sizes aren’t terribly important, and they don’t change the production or order fulfillment process much. They’re arguably just cosmetic changes, except this time they don’t modify the price. On the other hand, the customer base may search specifically for the XL option if it’s especially important to them (which it will be for some folks, like those with arthritis or poor eyesight).
Which of the arguments seems more convincing now? Doesn’t it feel a little more like a 50/50? Here’s an interesting take: could you sell a customer who searches explicitly for one of the options, another one of those options? It’s unlikely when we’re talking about editions (if someone searches for an old jPhone, they probably are set on that old one, perhaps for financial reasons) or about colors (they want that “sparkly green jphone 13”). But with the sizes, it doesn’t seem as unreasonable to think we could sell an XL to someone who didn’t search explicitly for that, or a regular-sized jPhone to someone who searched for XL. Perhaps we’d have a better performing product page if we gave the customer the option to choose the size on the page instead of creating independently searchable pages for each of the sizes. For that reason, I’m leaning towards keeping our JSON as-is with the size as an attribute on the searchable product.
Maybe you’ve noticed, but we’re reviewing these details on the phones in decreasing order of relevance. Not that nobody cares about storage space, but are you ever really dead set on a particular storage capacity on your new phone? It’s not unreasonable for Orange to feel that storage space should simply be an option in the checkout flow instead of having dedicated searchable pages for each of the storage space options.
Recall our last argument on the different phone sizes: could we sell a 512 GB phone to someone who explicitly searched for 256 GB? Absolutely! This happens all the time, especially if the price difference is minimal. Even if the price difference isn’t small, we’d likely upsell more if we put those options side by side, even when someone searched for a specific capacity. In this case, it pays to show a result that is a little less relevant to the literal query but more relevant to the business.
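To put rough numbers on the four decisions, this small Python sketch counts how many search records each choice would produce. The option counts are taken from the initial JSON; the function name record_count is just for illustration.

```python
from math import prod

# Number of options per attribute, counted from the JSON Orange gave us.
option_counts = {
    "editions": 3,
    "color": 6,
    "sizes": 2,
    "storageSpace": 3,
}


def record_count(split_on):
    """Records needed if every attribute in split_on becomes its own product."""
    return prod(option_counts[name] for name in split_on)


# Attributes in the order we reviewed them.
order = ["editions", "color", "sizes", "storageSpace"]
for i in range(1, len(order) + 1):
    split_on = order[:i]
    print(" + ".join(split_on), "->", record_count(split_on))
```

Stopping after color, as argued above, means 18 records. Splitting on size as well would double that to 36, and splitting on everything would give 108, most of which a customer could reasonably be talked out of.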
In the end, as far as search is concerned, your products should be as specific and broken-down as possible without compromising business goals. Our logical reasoning produced a definitely yes, a probably yes, a probably no, and a definitely no for the four attributes of the hypothetical smartphones above, but those answers are absolutely not universal — a technology company like Orange is going to have wildly different business goals to keep in mind than, say, a company that makes automotive tools, where the search queries are often far more specific.
If you want to see us walk through this same process with the automotive tools company (let’s call them Clip-On) and prove that this framework holds, check out part two in this series!
Until then,
Antoine and the Algolia crew
Antoine Hemery
Senior Software Engineer