Rakuten Data Challenge

Multimodal Product Classification and Retrieval

Data Challenge Checklist

  • Registration is now closed!
  • Join the data challenge slack channel: DC Slack

  • Click here for the final Scoreboards


    The Rakuten Multi-modal Product Data Classification and Retrieval challenge is organized by Rakuten Institute of Technology, the research and innovation department of the Rakuten Group. This challenge focuses on two topics: large-scale multi-modal (text and image) classification and cross-modal retrieval. The goal of the multi-modal classification task is to predict each product’s 'type code' as defined in the catalog of Rakuten France. In the cross-modal retrieval task, given the text of the products, the goal is to retrieve the images corresponding to those products.

    Cataloging product listings through text or image categorization is a fundamental problem for any e-commerce marketplace, with applications ranging from personalized search and recommendations to query understanding. Manual and rule-based approaches to categorization do not scale, since commercial products are organized into many, sometimes thousands of, classes. It has often been observed that when users categorize product data, not only the title and description text of a product is useful but also its associated images.


    In the taxonomy of Rakuten France, products sharing the same product type code share exactly the same array of attribute fields and possible values. Product type codes are numbers that map to a generic product name: for example, 1500 is Watches, 120 is Laptops, and so on. In that sense, the type code of a product is its category label.

    In the product catalog of Rakuten France, a product with the French title Klarstein Présentoir 2 Montres Optique Fibre is associated with an image and sometimes with an additional description. This product is categorized under product type code 1500, signifying watches. Other products, with different titles, images, and possibly descriptions, fall under the same product type code.

    Given this information about products, as in the example above, this challenge invites participating teams to build and submit systems that classify previously unseen products into their corresponding product type codes.

    The main tasks for this challenge are as follows:

    1. Multi-modal classification. Given a training set of products and their product type codes, predict the corresponding product type codes for an unseen, held-out test set of products. Systems are free to use the textual titles and/or descriptions whenever available, and additionally the images, to allow for true multi-modal learning.
    2. Cross-modal retrieval. Given a held-out test set of product items with their titles and (possibly empty) descriptions, predict the best image, from among a set of test images, corresponding to each product in the test set.


    For this challenge, Rakuten France has released approximately 99K product listings in TSV format, split into a training set (84,916 listings) and a test set (13,812 listings). The dataset consists of product titles, product descriptions, product images, and their corresponding product type codes. The test set will be released towards the end of the data challenge. Participants can assume the test set has been generated from the same data distribution as the training set.

    A detailed description of the data is included in this pdf file.

    Participation and Submission

    Please register at this link to participate. Submission is team-based: only the team leader can submit the prediction file, and the same person cannot be the team leader of multiple teams. There is no limit on team size. We will send you the details about downloading the data file after we receive your sign-up.

    Participants need to provide a prediction output file in the same format as the training output file, associating each (Integer_id, Image_id, Product_id) tuple with the predicted Prdtypecode. The first line of this test output file should contain the header. Please DO NOT change the order of the test titles in your submission file.
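    As a rough sketch, a submission file matching the description above could be assembled as follows. The output file name is a placeholder and the exact column order is an assumption inferred from the tuple described here, so check the detailed format instructions in the accompanying pdf before submitting.

```python
import csv

# Hypothetical output path; the required file name may differ.
PRED_FILE = "predictions.tsv"

def write_predictions(test_rows, predict):
    """Write one predicted Prdtypecode per test row, preserving test-set order."""
    with open(PRED_FILE, "w", newline="") as out:
        writer = csv.writer(out, delimiter="\t")
        # Header line, required as the first line of the submission file.
        writer.writerow(["Integer_id", "Image_id", "Product_id", "Prdtypecode"])
        for row in test_rows:  # iterate in the original test-set order
            writer.writerow([row["Integer_id"], row["Image_id"],
                             row["Product_id"], predict(row)])
```

    Here `predict` stands in for a trained classifier; the essential points are the header line and the unchanged row order.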

    Detailed instructions on the submission format are included in this pdf file.

    Accepted contributions will be presented during the eCom Workshop at SIGIR 2020.

    Evaluation Metric

    Since this challenge involves many classes with highly imbalanced numbers of samples, an item-weighted metric would not reveal the deficiencies of the classification algorithms, so participants are ranked using class-averaged and per-sample metrics instead. The evaluation script is included in the downloadable data file.

    • Task 1: We will use the macro-F1 score to evaluate product type code classification on held-out test samples. The score is the arithmetic average of the per-product-type-code F1 scores.
    • Task 2: For the cross-modal retrieval task, systems will be evaluated on recall at 1 (R@1) on held-out test samples. The score is the average over samples of a per-sample score of 1 if the returned image matches the title and 0 otherwise.
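    The official evaluation script ships with the data; purely as an illustration of the two definitions above, the metrics can be sketched in plain Python:

```python
def macro_f1(y_true, y_pred):
    """Arithmetic mean of per-class F1 over all classes present in y_true."""
    f1_scores = []
    for c in set(y_true):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        # F1 is the harmonic mean of precision and recall.
        f1_scores.append(2 * precision * recall / (precision + recall)
                         if precision + recall else 0.0)
    return sum(f1_scores) / len(f1_scores)

def recall_at_1(true_images, retrieved_images):
    """Fraction of samples whose top-ranked image is the correct one."""
    hits = sum(1 for t, r in zip(true_images, retrieved_images) if t == r)
    return hits / len(true_images)
```

    Note that macro-F1 weights every product type code equally regardless of its sample count, which is why it is preferred here over an item-weighted score.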
    Stage 1 - Model Building (April 20 - July 15)

    Participants build and test models on the training data. The leaderboard shows model performance on a SUBSET of the test set according to your LATEST submission. Each team can submit at most 4 times per day in this stage. The leaderboard freezes on July 15 at 5 PM PDT, i.e. UTC 00:00:00 the next day.

    Stage 2 - Model Evaluation (July 15 - July 23)

    The final leaderboard will freeze on July 23 at 5 PM PDT, i.e. UTC 00:00:00 the next day, and will show model performance on the remaining held-out test set according to your LATEST submission. In this stage each team can submit at most 8 times while the evaluation is open, with at least 24 hours between consecutive submissions.

    System Description Paper

    System description papers will be peer reviewed (single-blind) by the program committee. All submissions must be formatted according to the latest ACM SIG proceedings template. There is no specific constraint on the content, but it should cover implementation details such as data preprocessing (including token normalization, feature extraction, and any additional data used from external sources), model descriptions (including specific implementations, parameter tuning, etc.), and error analysis, if any. The suggested paper length is 2-4 pages; parameter tuning settings and similar information can be moved to an appendix.

    Submissions to SIGIR eCom should be made at https://easychair.org/my/conference?conf=sigirecom20dc

    Instructions for submitting the system description paper will be available soon. The deadline for paper submission is July 17, 2020 (11:59 P.M. UTC).

    April 20 Registration Opens, Evaluation Stage 1 Starts!
    July 15 Leaderboard for Stage 1 Closes (Registration closes)
    July 15 Evaluation Stage 2 Starts
    July 23 Evaluation Stage 2 Closes (Final Leaderboard)
    July 30 eCom Full Day Workshop

    Data Challenge Paper Submission Timeline
    July 17 System Description Paper Submission (Suggested paper length 2-4 pages with separate Appendix)
    July 23 Paper Acceptance Notification
    July 27 Camera Ready Version of Papers Due (Updated with your final methodology and results)

    If you have any question, please contact Hesam Amoualian (hesam.amoualian@rakuten.com) or Parantapa Goswami (parantapa.goswami@rakuten.com).

    Phase 1 Scoreboard
    Task 1: Multimodal Classification
    Rank Team Name Last Submission Macro-F1 score
    1 Transformers 2020 Jul 13 06:53:43 91.94
    2 zenit84 2020 Jun 27 17:33:34 91.63
    3 Alto 2020 Jul 14 15:40:57 91.63
    4 Beantown 2020 Jul 15 23:47:03 90.89
    5 Synerise AI 2020 Jul 07 13:45:38 89.72
    6 pa_curis 2020 Jul 15 17:38:26 89.65
    7 RIT-Paris Baseline 2020 Jul 15 10:48:14 87.05
    8 tester 2020 Jun 24 17:53:50 86.94
    9 testers 2020 Jul 08 16:19:24 85.87
    10 MMG_AI_TEAM 2020 Jul 15 11:27:08 84.81
    11 DeepData 2020 Jun 18 09:11:58 84.32
    12 overfiTTers 2020 Jul 12 12:57:20 81.9
    13 Team MLG 2020 Jul 15 05:01:11 65.8
    14 qrudraksh 2020 May 27 20:03:26 58.0
    15 7ate9 2020 Jun 10 04:39:36 53.29
    Task 2: Cross-modal Retrieval
    Rank Team Name Last Submission Recall@1 score
    1 Synerise AI 2020 Jul 01 12:30:48 50.23
    2 changer 2020 Jul 12 11:14:59 46.85
    3 pa_curis 2020 May 27 09:34:52 41.89
    4 Beantown 2020 Jul 15 20:38:29 38.96
    5 Alto 2020 Jul 03 01:49:35 38.29
    6 MMG_AI_TEAM 2020 Jul 15 11:29:07 27.25
    7 kenneth 2020 Jun 16 20:12:10 1.35
    Phase 2 Scoreboard
    Task 1: Multimodal Classification
    Rank Team Name Last Submission Macro-F1 score
    1 pa_curis 2020 Jul 21 10:25:24 91.44
    2 Alto 2020 Jul 23 21:35:59 90.87
    3 Transformers 2020 Jul 23 16:23:52 90.53
    4 zenit84 2020 Jul 23 22:22:36 90.39
    5 Beantown 2020 Jul 22 03:58:44 90.22
    6 Synerise AI 2020 Jul 17 04:42:21 89.78
    7 MMG_AI_TEAM 2020 Jul 22 13:22:00 86.94
    8 RIT-Paris Baseline 2020 Jul 18 09:19:53 85.36
    9 Team MLG 2020 Jul 17 01:22:23 64.48
    Task 2: Cross-modal Retrieval
    Rank Team Name Last Submission Recall@1 score
    1 Synerise AI 2020 Jul 17 19:04:22 34.28
    2 changer 2020 Jul 21 16:59:52 31.93
    3 Beantown 2020 Jul 22 17:05:28 23.3
    4 Alto 2020 Jul 20 23:40:02 19.99
    5 pa_curis 2020 Jul 23 13:19:58 19.74
    6 MMG_AI_TEAM 2020 Jul 20 12:39:02 15.77
    Data Challenge Organizers

    Hesam Amoualian      Rakuten Institute of Technology, Paris
    Parantapa Goswami    Rakuten Institute of Technology, Paris
    Laurent Ach          Rakuten Institute of Technology, Paris
    Pradipto Das         Rakuten Institute of Technology, Americas
    Pablo Montalvo       Rakuten Institute of Technology, Paris