Coveo Data Challenge

In-session prediction for purchase intent and recommendations

    Click here for the data challenge overview paper

    Click here for the Leaderboards

    Join the data challenge Slack channel: DC Slack

Welcome to the Data Challenge leaderboard page for the 2021 SIGIR Workshop on eCommerce! Training data, evaluation scripts and rules can be found in the official challenge repository; related literature, background information and industry use cases can be found in the challenge paper pre-print.

Challenge Overview

This challenge addresses the growing need for reliable predictions within the boundaries of a shopping session, as customer intentions can differ from one occasion to the next. In the context of e-commerce technology, the feedback loop determined by behavioural signals ranges from hours down to a few seconds, and machine learning models need to adapt as fast as possible to the continuously changing nature of the customer journey.

The need for efficient procedures for personalization is even clearer if we consider the e-commerce landscape more broadly: outside of giant digital retailers, the constraints of the problem are stricter, due to smaller user bases and the realization that most users are not frequently returning customers.

We release a new session-based dataset including fine-grained browsing events (detail, add, purchase), enriched by linguistic behavior (queries made by shoppers, with items clicked and items not clicked after the query) and catalog meta-data (image, text, pricing information). On this dataset, we ask participants to showcase innovative solutions for two open problems:

  1. a recommendation task, where a model is shown k events at the start of a session, and it is asked to predict future product interactions in the same session;
  2. an intent prediction task, where a model is shown a session containing an add-to-cart event, and it is asked to predict whether the item will be bought before the end of the session.

Please refer to the public repository for details on rules, evaluations and everything related to the dataset.
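
To make the two task definitions concrete, here is a minimal Python sketch of how a single session could be turned into an input/target pair for the recommendation task and into a label for the intent prediction task. The event representation (dictionaries with `action` and `product` keys) and the helper names `split_for_recommendation` and `purchase_label` are assumptions made for illustration only; the actual data format and the official evaluation protocol are those defined in the challenge repository.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical event record: the real dataset schema is defined in the
# challenge repository and may differ from this illustration.
Event = Dict[str, str]  # keys assumed here: "action" and "product"


def split_for_recommendation(
    session: List[Event], k: int
) -> Tuple[List[Event], List[str]]:
    """Return the first k events (model input) and the products
    interacted with in the rest of the session (prediction targets)."""
    seen = session[:k]
    targets = [event["product"] for event in session[k:]]
    return seen, targets


def purchase_label(session: List[Event]) -> Optional[bool]:
    """Label a session for intent prediction.

    Returns None when the session has no add-to-cart event; otherwise
    True if the first added product is purchased later in the session.
    """
    for i, event in enumerate(session):
        if event["action"] == "add":
            added = event["product"]
            return any(
                later["action"] == "purchase" and later["product"] == added
                for later in session[i + 1:]
            )
    return None


# Toy session using the three browsing event types named above.
session = [
    {"action": "detail", "product": "p1"},
    {"action": "detail", "product": "p2"},
    {"action": "add", "product": "p2"},
    {"action": "detail", "product": "p3"},
    {"action": "purchase", "product": "p2"},
]

seen, targets = split_for_recommendation(session, k=2)
label = purchase_label(session)
```

In this toy session the model would see the first two `detail` events and be asked to predict the later product interactions, while the intent label is positive because the added product is purchased before the session ends.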


Organizing Committee


The organizers wish to thank Luca Bigon for his outstanding support in data collection, and Surya Kallumadi, Massimo Quadrana, Dietmar Jannach and Ajinkya Kale for valuable feedback on a previous version of this paper. Finally, special thanks to Richard Tessier and Coveo's legal team for believing in this data sharing initiative.

System Description Paper

We solicit the submission of system papers that describe in detail modelling choices, data insights and interesting findings. System description papers will be peer reviewed by the program committee in a single-blind process (we do not accept anonymized submissions). We accept contributions of up to 4 pages (plus references and an appendix if needed).

Typically, a system paper would include sections on data analysis, related work, architecture and experiments (with baselines) - a good example from last year can be found here. Please refer to the challenge paper for a list of interesting questions in the target domain.

Paper submissions can be made between June 10th and June 25th. All submissions should be made here and must be formatted according to the latest ACM SIG proceedings template. Please note that at least one author of each accepted paper must register for the workshop and present the paper.

Timeline (UTC)

April 21: Data Challenge registration opens; Stage 1 opens
June 5: Data Challenge registration deadline
June 10: Stage 1 closes
June 11: Stage 2 opens; Paper submission opens
June 17: Stage 2 closes
June 25: Paper submission closes
July 7: Paper Accept/Reject
July 10: Camera ready paper submission deadline
July 15: Workshop

