Learning To Rank with eCommerce data

What, Why and How of Learning to Rank:

Learning to Rank (LETOR) is used in the information retrieval (IR) class of problems, where ranking relevant documents is paramount to returning optimal results. A typical search engine, for example, indexes several billion documents. Building a ranking model that can surface pertinent documents from an indexed document set based on a user query is one of its core imperatives. To accomplish this, documents are grouped by user-query relevance, domains, subdomains, and so on, and ranking is performed within each group. In eCommerce, the initial ranking is based on the relevance judgement of a document (product) with respect to a query.

Why is Learning to Rank important in eCommerce?

Like any other IR system, eCommerce product retrieval is only as good as the quality of the results shown to the end user, which in turn drives the end goal of an eCommerce portal.

Research to substantiate the importance of intelligent ranking models

Approaches to Ranking:

Labeled training data, grouped on the criteria described earlier, is ranked primarily using one of the following common approaches:

  • Point-wise: A single instance is used during learning and the gradient is computed using just that instance. In this approach, the position of the training instance in the ranked list isn’t considered, so irrelevant instances could be given more importance.
  • Pair-wise: An instance pair is chosen for every training instance during learning, and the gradient is computed based on the relative order between them.
  • List-wise: Multiple instances are chosen and the gradient is computed based on that set of instances.
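As a toy illustration, here is how point-wise and pair-wise training instances could be constructed from one query's labeled documents. This is a sketch with made-up relevance labels, not production code:

```python
# Sketch: forming point-wise vs pair-wise training examples
# from one query's labeled documents (hypothetical relevance labels).

docs = {"d1": 3, "d2": 1, "d3": 2}  # doc id -> graded relevance label

# Point-wise: each (doc, label) is an independent training instance;
# the doc's position relative to the other docs is ignored.
pointwise = [(d, label) for d, label in docs.items()]

# Pair-wise: one instance per ordered pair where one doc is more
# relevant than the other; the model learns the relative order.
pairwise = [(a, b) for a in docs for b in docs if docs[a] > docs[b]]

print(pointwise)  # 3 independent instances
print(pairwise)   # [('d1', 'd2'), ('d1', 'd3'), ('d3', 'd2')]
```

Note how the pair-wise view encodes only relative preferences, which is why position in the list matters to it in a way it cannot for the point-wise view.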
Different approaches to ranking

Introduction to Unbxd Approach:

We use LambdaMART (which optimizes a list-wise metric) as our algorithm and XGBoost as the framework to train an effective ranking model. XGBoost implements the LambdaMART ranking algorithm (for boosted trees), which uses the pairwise ranking approach to minimize pairwise loss by sampling many pairs. The algorithm itself is outside the scope of this post. For more information, see the paper, A Stochastic Learning-To-Rank Algorithm and its Application to Contextual Advertising. The pros and cons of the different ranking approaches are described in LETOR in IR.
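To give a flavor of the idea without reproducing the actual algorithm, the pairwise "lambda" gradient at the heart of LambdaMART can be sketched roughly as follows; the scores and the ΔnDCG value here are hypothetical, and this is a simplification of the published formulation:

```python
import math

def pairwise_lambda(s_i, s_j, delta_ndcg, sigma=1.0):
    """Rough sketch of the LambdaMART pairwise gradient for a pair
    where doc i is more relevant than doc j: a logistic-loss gradient
    on the score difference, scaled by how much swapping the two
    docs would change nDCG (|delta_ndcg|)."""
    return -sigma / (1.0 + math.exp(sigma * (s_i - s_j))) * abs(delta_ndcg)

# If the model currently scores the less-relevant doc higher
# (s_i < s_j), the gradient magnitude is large...
print(pairwise_lambda(0.2, 1.0, delta_ndcg=0.3))
# ...and much smaller once the pair is already ordered correctly.
print(pairwise_lambda(1.0, 0.2, delta_ndcg=0.3))
```

The ΔnDCG scaling is what lets a pairwise gradient optimize a list-wise metric: pairs whose misordering hurts nDCG the most get the strongest push.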

Boosted Multiple Trees

The Challenges with Ranking models

The end-to-end process demands a large volume of curated, processed data points to train a visibly productive and efficient ranking model, which can be costly with eCommerce data. The variance across categories of data for each system makes it harder to design a generic processing pipeline that is both scalable and adaptive.

How can ranking help eCommerce organizations and the end user/customer?

The primary goal of any eCommerce website is to sell products. The company is happy when it can sell a product, and the customer is happy when they can buy the product they want with ease. This alignment makes it easier for both parties to close a deal, since both are looking for the same outcome. Given a query made by a customer, it is fairly obvious that the customer is interested in buying; it is then up to the system to retrieve products that match the query and rank them based on certain indirect signals. Thus, an improvement in the ranking of retrieved products can impact the outcome of the buyer-seller system manifold.

Use Case 

We will see how a ranking system can be intuitively useful for a simple interactive query to discover a product. The assumptions for this use case are:

1. We have 6 products in our inventory, and our UI shows 2 products on each page. Recall the importance of page 1 versus page 2 onwards from the research outlined earlier.

2. The products have only two fields associated with them, say, Name and Color. Let our inventory look like this: 

| Product ID | Product Name | Color |
|---|---|---|
| A | Apple iPhone 7 | Black |
| B | iPhone 8 | Blue |
| C | Apple Phone 11 | Black |
| D | iPhone | Red |
| E | iPhone 7 | Red |
| F | Apple iPhone | Blue |

3. The query the system receives is “Blue iPhone 7”.

4. Based on past data, the same query “Blue iPhone 7” has been searched 100 times and has clickstream data recorded for these same products.

Secondary Assumption: These numbers are generated by another system that shows more than 6 products to the customer on the same page, thus removing the bias of page 1 versus the rest of the pages. These data points are evidence of which product relates most to the query, and they form the ground truth on which the system will learn to rank products. In this way, any number that gives a good estimate of the correct rank can be used as a score for our learning labels.

Now, without a ranking system, the query “Blue iPhone 7” might fetch products in some random order; for example, they may be presented as:

  • Page 1: A, B
  • Page 2: C, D
  • Page 3: E, F

Whereas the ideal placement for products as directed by the data from 100 query results should look like:

  • Page 1: E, A
  • Page 2: B, C
  • Page 3: D, F

Note: [C, D, and F can be arranged in any order whatsoever]
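A minimal sketch of this re-ranking, assuming hypothetical click counts consistent with the ideal order above (the exact counts from the 100 searches are not given, so these numbers are illustrative only):

```python
# Hypothetical click counts for the query "Blue iPhone 7" over 100
# searches -- invented numbers, chosen only to reproduce the ideal
# order E, A, B, then C/D/F tied.
clicks = {"A": 25, "B": 15, "C": 4, "D": 4, "E": 48, "F": 4}

# Rank products by click count, highest first; Python's sort is
# stable, so tied products (C, D, F) keep their insertion order.
ranking = sorted(clicks, key=clicks.get, reverse=True)

# Paginate 2 products per page, as in the use-case assumptions.
page_size = 2
pages = [ranking[i:i + page_size] for i in range(0, len(ranking), page_size)]
print(pages)  # [['E', 'A'], ['B', 'C'], ['D', 'F']]
```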

That being said, let’s look into some features from our data that can help us learn the target function.

A very intuitive and basic feature set may look like:

  • Feature 1: Similarity measures (cosine, fuzzy, etc) between the Query vector and Name vector
  • Feature 2: Similarity of the NER-parsed color (if present) in the Query with the product’s Color field.
  • Feature 3: String length of the Query
  • Feature 4: String length of the Name of product

And so on… [Typically a feature set contains 100–300 features.]
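As a rough sketch, the four features above could be computed like this. The toy color lexicon stands in for a real NER component, and the whitespace tokenization and bag-of-words cosine are deliberately naive:

```python
import math

def tokens(s):
    """Naive whitespace tokenizer (a real system would do far more)."""
    return s.lower().split()

def cosine(a, b):
    """Cosine similarity between bag-of-words vectors of two strings."""
    ta, tb = tokens(a), tokens(b)
    vocab = set(ta) | set(tb)
    va = [ta.count(w) for w in vocab]
    vb = [tb.count(w) for w in vocab]
    dot = sum(x * y for x, y in zip(va, vb))
    na = math.sqrt(sum(x * x for x in va))
    nb = math.sqrt(sum(x * x for x in vb))
    return dot / (na * nb) if na and nb else 0.0

COLORS = {"black", "blue", "red"}  # toy lexicon standing in for NER

def features(query, name, color):
    """Feature vector [F1..F4] for one query/product pair."""
    q_color = next((w for w in tokens(query) if w in COLORS), None)
    return [
        cosine(query, name),                       # F1: query/name similarity
        1.0 if q_color == color.lower() else 0.0,  # F2: color match
        float(len(query)),                         # F3: query length
        float(len(name)),                          # F4: name length
    ]

print(features("Blue iPhone 7", "iPhone 7", "Red"))
```

Features 3 and 4 depend on only the query or only the product, while features 1 and 2 depend on both, which is exactly the three-bucket split described next.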

Training data for learning to rank models

These feature sets can hold a huge number of features, ranging from a few hundred to a few thousand. Almost all features can be classified into 3 buckets:

  • Query Features (Feature 3)
  • Product/Document Features (Feature 4)
  • Dynamic/Query-Document Features (Feature 1, 2)

Based on these features and the given ground-truth targets [clickstream scores], we can train a LambdaMART model to rank our products, optimizing for nDCG scores.

Before and After Learning to Rank

Thus, we now have a mathematical cost function [lambda] that can be optimized to reduce the error [i.e., increase the nDCG score] associated with the ideal placement versus random placements, where the ideal placement is governed by the clickstream data and is more intelligent in nature.
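For concreteness, here is a minimal nDCG computation over the two placements from the use case, using hypothetical graded relevance labels (E=3, A=2, B=1, C=D=F=0) consistent with the ideal order, since the source does not give exact scores:

```python
import math

def dcg(relevances):
    """Discounted cumulative gain of a ranked list of relevance labels,
    using the common (2^rel - 1) gain and log2 position discount."""
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(relevances))

def ndcg(relevances):
    """nDCG: DCG of the given order divided by DCG of the ideal order."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal else 0.0

# Hypothetical labels in the two placements from the use case:
random_order = [2, 1, 0, 0, 3, 0]  # products shown as A, B, C, D, E, F
ideal_order  = [3, 2, 1, 0, 0, 0]  # products shown as E, A, B, C, D, F

print(round(ndcg(random_order), 3))
print(ndcg(ideal_order))  # 1.0 -- the ideal placement is the upper bound
```

The gap between the two scores is precisely what the learned model is trained to close.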

How nDCG scores work in ranking models


There can be vast complexity in designing, processing, and storing advanced feature sets, and it is beyond the scope of this article to dive into the implementation or mathematical formulation of these techniques. We hope this article gives an idea of how Learning to Rank systems can help an eCommerce platform, and how Unbxd can help any platform reap the rewards.
