How to generate Product Synonyms for query expansion and increased product coverage?

Why are product synonyms important?

The central task of information retrieval (IR) in eCommerce is to find products that satisfy users’ needs. A concise product description rarely covers every synonym for a product, let alone every dialect of the language. Those synonyms are important for delivering the right search results and increasing the effectiveness of eCommerce site search.

Shoppers typing a query might not know the domain-specific terminology, so the words in the query might not match the words in the product description. This happens often, and the result is that shoppers cannot find products that are actually in the catalog, simply because the site search system has poor synonym matching capabilities.

Why domain-specific synonyms?

Say Bob wants to gift Alice a glitter dress for Christmas. He visits an eCommerce site and types the query “glitter dress.” If every glitter dress in the catalog is described as a sequin dress (sequin being the formal fashion word for glitter), the site cannot fetch those results because there is *no match*. Bob leaves the site unsatisfied, assuming that the glitter dress he wants is not available. That is a loss for the eCommerce site, which loses a valued customer, and of course for Bob, who cannot gift Alice the dress.

This is a typical case of terminology mismatch between the product description and the customer’s query, and it happens a lot. Dialects cause the same problem, and languages like English have many of them. “Mobile phone” (British) and “cell phone” (American) mean the same thing, but a product description might not cover both.

Now, consider that a word like apple changes its meaning with the domain. In the technology domain, Apple refers to a technology company, while in the food industry, an apple is a healthy fruit. Cases like these are why the synonyms we generate need to be domain-specific.

To overcome such scenarios, we follow a process called Query Expansion (QE). QE is a process in Information Retrieval that consists of selecting and adding terms to the user’s query to minimize query-document mismatch and improve retrieval performance (quoted from wiki). For QE, we need to identify the domain-specific synonyms.
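To make the idea concrete, here is a minimal sketch of query expansion in Python. The synonym table and the `expand_query` helper are illustrative assumptions, not Unbxd’s implementation; a production system would typically expand the query inside the search engine rather than by building strings.

```python
from typing import Dict, List, Set

# Hypothetical domain-specific synonym table (illustrative entries only).
FASHION_SYNONYMS: Dict[str, Set[str]] = {
    "glitter": {"sequin"},
    "sequin": {"glitter"},
}


def expand_query(query: str, synonyms: Dict[str, Set[str]]) -> List[str]:
    """Return the original query plus one variant per synonym substitution."""
    tokens = query.lower().split()
    expanded = [" ".join(tokens)]
    for i, token in enumerate(tokens):
        # Swap one token at a time for each of its known synonyms.
        for alt in sorted(synonyms.get(token, ())):
            variant = " ".join(tokens[:i] + [alt] + tokens[i + 1:])
            if variant not in expanded:
                expanded.append(variant)
    return expanded


print(expand_query("glitter dress", FASHION_SYNONYMS))
# ['glitter dress', 'sequin dress']
```

With this table, Bob’s “glitter dress” query would also be matched against products described as “sequin dress.”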

How do we generate product synonyms?

At Unbxd, we have broadly classified the synonym generation process into three categories:

  • Manual Curation
  • Language-Specific Synonyms from Linguistic Knowledge
  • Mining Synonyms from ClickStream Data

Manual Curation

This is a straightforward process: a skilled linguist or a community of contributors manually adds to the existing domain-specific synonyms. The quality of this curation is quite high, but the major drawback is that the process is resource- and time-intensive.

Language-Specific Synonyms from Linguistic Knowledge

There is vast public knowledge available for languages like English. For synonyms, we could leverage freely available lexical databases such as WordNet and ConceptNet. The main issue with this approach is that the synonyms available are generic (i.e., not domain-specific). Here at Unbxd, we filter these down to domain-specific synonyms using filtering algorithms based on clickstream data. Another major drawback is that, with new internet slang emerging every day, it is difficult for these lexical databases to stay up-to-date.
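The filtering algorithms themselves are not described here, so the following is only a hypothetical sketch of one simple way such a filter could work: keep a generic synonym pair only when both terms actually occur in the domain’s search queries. The function name, the `min_count` threshold, and the sample data are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set, Tuple


def filter_domain_synonyms(
    generic_synonyms: Dict[str, Set[str]],
    domain_queries: Iterable[str],
    min_count: int = 2,
) -> List[Tuple[str, str]]:
    """Keep generic synonym pairs whose terms both occur in domain queries."""
    # Count how often each word appears across the domain's query log.
    term_counts = Counter(
        token for query in domain_queries for token in query.lower().split()
    )
    kept = set()
    for term, candidates in generic_synonyms.items():
        for candidate in candidates:
            if term_counts[term] >= min_count and term_counts[candidate] >= min_count:
                # Store each pair once, regardless of direction.
                kept.add(tuple(sorted((term, candidate))))
    return sorted(kept)


# Hypothetical generic synonyms, e.g. as pulled from a lexical database.
generic = {"glitter": {"sequin", "glisten"}}
queries = ["glitter dress", "sequin dress", "sequin top", "glitter heels"]

print(filter_domain_synonyms(generic, queries))
# [('glitter', 'sequin')]
```

Here “glisten” is dropped because shoppers in this (made-up) fashion query log never use it, while “glitter” and “sequin” both appear often enough to be kept.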

Mining Synonyms from ClickStream Data

Considering the high volume of clickstream data available, mining synonyms from this data turns out to be quite cheap, and the resulting synonyms are of high quality. If Bob is familiar with fashion terminology, he would reformulate the query to *sequin* and retry when he doesn’t find a glitter dress. We leverage those scenarios to mine the collective intelligence from users’ reformulated searches and generate high-quality synonyms. This approach is heavily based on query chain analysis.
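As a rough illustration of query chain analysis, the sketch below counts how often a query that received no click is immediately followed, in the same session, by a different query that did receive one. The session format, the `mine_reformulations` helper, and the `min_support` threshold are assumptions for this example, not a description of Unbxd’s production pipeline.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Each session is an ordered list of (query, had_click) events.
Session = List[Tuple[str, bool]]


def mine_reformulations(
    sessions: List[Session], min_support: int = 2
) -> Dict[Tuple[str, str], int]:
    """Count failed-query -> successful-query pairs seen in enough sessions."""
    pair_counts: Counter = Counter()
    for session in sessions:
        # Look at each pair of consecutive searches within a session.
        for (q1, clicked1), (q2, clicked2) in zip(session, session[1:]):
            q1, q2 = q1.lower().strip(), q2.lower().strip()
            if q1 != q2 and not clicked1 and clicked2:
                pair_counts[(q1, q2)] += 1
    # Keep only pairs that enough shoppers produced independently.
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}


sessions = [
    [("glitter dress", False), ("sequin dress", True)],
    [("Glitter dress", False), ("sequin dress", True)],
    [("glitter dress", False), ("red dress", True)],
]

print(mine_reformulations(sessions))
# {('glitter dress', 'sequin dress'): 2}
```

The support threshold is what filters out one-off reformulations such as “red dress,” where the shopper simply changed their mind. Reducing the surviving query pairs to term-level synonyms (glitter, sequin) would need a further step that is not shown here.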

Conclusion

We have successfully generated and tested high-quality, reliable synonyms using the approaches discussed above in various domains such as auto parts, technology, jewelry, and fashion. For example, in the auto parts domain: o2 compressor and oxygen compressor; and in an online fashion store: sequin and glitter.

In this way, we ensure that online shoppers don’t leave the eCommerce site unhappy and unsatisfied, and that they find what they came looking for. eCommerce sites can’t leave money on the table just because their search couldn’t understand what the shopper meant by the search query.

If you have been a victim of poor synonym matching capabilities in your search solution, reach out to sales@unbxd.com and we will be happy to walk you through in detail how Unbxd technology simplifies the query analysis for your eCommerce site.
