Machine Learning Versus Automation in Site Search for eCommerce

Machine learning, AI, and deep learning are trending buzzwords right now, especially in site search and online commerce.

But how do you tell whether you need machine learning? And how do you know whether a particular solution actually uses it?

Though the terms automation and machine learning are used interchangeably, the biggest difference between the two is that automation does not require any learning.

Automation is all about using machines to do what humans would otherwise do manually. For instance, in e-commerce site search, a good automation-based solution will use past purchase information to rank products with high sales numbers at the top. It might even use ‘clicks’ to automatically show highly viewed products at the top. This is very useful because many search systems today require humans to input rules to boost products to the top.
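The popularity-based ranking described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `rank_by_popularity`, the `click_weight` parameter, the product IDs, and all the numbers are hypothetical and do not come from any real search product.

```python
from collections import Counter

def rank_by_popularity(matching_products, purchases, clicks, click_weight=0.1):
    """Order the products that match a query by a fixed popularity score.

    The score is units sold plus a small weight per click. Nothing is
    learned here: the same fixed rule is applied to every query.
    """
    sales = Counter(purchases)
    views = Counter(clicks)
    return sorted(
        matching_products,
        key=lambda p: sales[p] + click_weight * views[p],
        reverse=True,
    )

# Hypothetical products that all match the query "pants".
matching = ["chino-pant", "cargo-pant", "black-legging"]
# Hypothetical purchase and click logs (one entry per event).
purchases = ["black-legging"] * 50 + ["chino-pant"] * 10 + ["cargo-pant"] * 5
clicks = ["black-legging"] * 200 + ["chino-pant"] * 120 + ["cargo-pant"] * 40

ranked = rank_by_popularity(matching, purchases, clicks)
print(ranked)
```

With these made-up numbers the best-selling legging is ranked first for ‘pants’, simply because the rule rewards overall popularity and knows nothing about what the searcher meant.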

[Image 1: Machine Learning vs. Automation in Site Search for Ecommerce]

The flip side of automation is that it can get the ranking within search results wrong (as in the image above). If a ‘legging’ is selling really well, it’s great if it shows up when somebody searches for ‘pants’. But the system is geared towards showing the most popular products the most. What about the 80%+ of visitors who actually want to see pants, not just the most popular legging (which technically is also a pant)? This ends up serving the top 20% of best-selling products to just 20% of the visitors shopping online.

So what is machine learning? If you need a system that not only automates human tasks but also sifts through a lot of data to identify predictable patterns in data sets, you need machine learning. Machine learning systems require intensive computation (a lot of CPUs) and need training data sets, which means a lot of data.

They use that data to predict, for instance, how likely a user is to click on a certain type of product after a certain search. Another use case is predicting which other queries are similar to a particular query but have different search results. The limiting factor for machine learning is the data: the more data it ingests, the more realistic the results are likely to be. Machine learning models cannot predict outside of the data that is fed in. This is important: if somebody is offering machine learning and is not ingesting a HUGE amount of data, they are just automating, and there is no machine learning in play.
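To make the idea of predicting click likelihood concrete, here is a minimal sketch. It is not a production model and not necessarily how any vendor does it: it is just a smoothed click-rate estimate per (query, product type) pair, fitted from logged impressions. The function name `fit_click_model`, the prior values, and the log data are all hypothetical.

```python
def fit_click_model(events, prior_clicks=1.0, prior_impressions=10.0):
    """Estimate P(click | query, product_type) from logged impressions.

    events: iterable of (query, product_type, clicked) tuples.
    The priors smooth the estimate for pairs with little or no data.
    """
    impressions = {}
    clicks = {}
    for query, product_type, clicked in events:
        key = (query, product_type)
        impressions[key] = impressions.get(key, 0) + 1
        clicks[key] = clicks.get(key, 0) + int(clicked)

    def predict(query, product_type):
        key = (query, product_type)
        seen = impressions.get(key, 0)
        clicked_count = clicks.get(key, 0)
        return (clicked_count + prior_clicks) / (seen + prior_impressions)

    return predict

# Hypothetical impression log: (query, product type shown, was it clicked?).
events = (
    [("pants", "pant", True)] * 30 + [("pants", "pant", False)] * 70
    + [("pants", "legging", True)] * 5 + [("pants", "legging", False)] * 95
    + [("leggings", "legging", True)] * 40 + [("leggings", "legging", False)] * 60
)

predict = fit_click_model(events)
p_pant = predict("pants", "pant")
p_legging = predict("pants", "legging")
p_unseen = predict("shorts", "pant")  # this query never appears in the log
print(p_pant, p_legging, p_unseen)
```

In this made-up log, people searching for ‘pants’ click pants far more often than leggings, so the estimate for pants is higher. For the unseen query ‘shorts’ the estimate is nothing more than the prior, which illustrates the point that a model cannot predict outside of the data it is fed.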

In a typical use case, when a machine learning system ingests traffic sources, search queries, clicks on filters, click rank of products, product clicks, time spent on product detail pages (PDPs), add-to-cart events, purchases, exits, repeat visits, non-search clickstream events, and web-based social trending data, it is likely to be trained well enough to perform.

[Image 2: Machine Learning vs. Automation in Site Search for Ecommerce]

With machine learning, while relevancy plays its part, the system is able to determine on its own that there is sufficient deviation in behavior when people search for ‘pants’ vs. ‘leggings’. It’s not programmed or automated, as is evident in the image above.

At Unbxd, when we compared a machine learning based search system with the best automated systems in an A/B test (on CTRs and conversion rates), they both behaved similarly on the top 5% of queries. However, the ML-based search outperformed automated search by 2x (a 100% difference in conversions) on the remaining 95%. This is because the top 5% of search queries are heavily driven by sales data. The remaining 95% of queries need more predictability, not just automation.
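For readers who want to check the arithmetic, "2x" and "100% difference" describe the same relative lift. The sketch below shows the calculation; the conversion rates are hypothetical placeholders chosen only to show a 2x result, not the figures from the test.

```python
def relative_lift(baseline_rate, variant_rate):
    """Relative improvement of the variant over the baseline, as a fraction."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return (variant_rate - baseline_rate) / baseline_rate

# Hypothetical conversion rates, for illustration only.
automated_cr = 0.02  # automated search
ml_cr = 0.04         # machine learning based search

lift = relative_lift(automated_cr, ml_cr)
print(f"{lift:.0%}")  # prints 100%
```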

One thing to note, though, is to be realistic about what machine learning can deliver. You need significant data to make training models work. If your site has little traffic, it might take a while before it starts kicking in.

To summarize, you can use this simple checklist to verify whether a site search has machine learning or not:

  1. Does the site search ingest at least 5 different types of signals?
  2. What machine learning models does it use?
  3. What are some patterns that these models have predicted?
  4. Is there a dedicated data science team?
  5. What’s the technical architecture for machine learning in play?


This article is the first in a series of super cool posts by Pavan which he pens down when he’s in a mood to share. You can read it on Medium, too. 
