05.09 - 01.11 (100% complete)

Competition is over


Problem description

The goal of the competition is to develop an algorithm that can successfully answer questions and pass an examination test based on information from publicly available data sources. Participants are provided with test examples that can be used for validation and model training. Competition solutions are submitted to the automatic testing system and evaluated on a hidden set of questions.

 TEST EXAMPLES

Submission format

Each solution is an archive with code that runs in a Docker container environment. Solution archives are submitted to the automatic testing system. Exam tasks with questions are graded automatically; essay writing tasks are graded by professional expert assessors 1-2 times a week. The competition metric is the total exam score across all test questions.

 BASELINE SOLUTION  SBER&HUAWEI BASELINE

Prizes

The total prize fund is 3 000 000 ₽!
The 1st place gets 1 000 000 ₽, the 2nd place 500 000 ₽, and the 3rd place 300 000 ₽. The 4th and 5th places each get 200 000 ₽, and the 6th to 10th places each get 100 000 ₽.
On top of that, there are 2 special nomination awards, "Best question solution" and "Best essay solution", worth 150 000 ₽ each.

 COMPETITION RULES

Data format

The examination test is passed to the solution in JSON format. The test consists of a set of question tasks, resource and time constraints, and metainformation (such as the test language).

Each question task object in the test contains the following fields (a sketch of a possible task object follows the list):

  • text - the question task text. May contain markdown-style formatting. The text can include links to attachment files, e.g. graphic illustrations for the task.
  • attachments - set of attached files (with their id and mime-type).
  • meta - metainformation. Arbitrary key-value pairs available to the solution and the testing system. Used to provide structured information about the task, e.g. the question source or the originating exam topic.
  • answer - format description for the expected answer type. Multiple question types are considered, each with its own specific parameters and fields:
    • choice - choosing one option from the list;
    • multiple_choice - choosing a subset of options from the list;
    • order - arranging options from the list in the correct order;
    • matching - correct matching of objects from two sets;
    • text - answer in the form of arbitrary text.
  • score - maximum number of points for the task. Based on this field, solutions can prioritise computational resources between tasks.
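
A minimal sketch of how a solution might read a task object and dispatch on its answer type. The field names follow the list above; the concrete values, the answer payload layout, and the handler results are hypothetical placeholders (the authoritative format is defined by the test examples and the baseline solution):

```python
import json

# Hypothetical example of a single question task; real tests may contain
# additional fields and different meta keys.
task = json.loads("""
{
  "id": "task-1",
  "text": "Choose the correct option ...",
  "attachments": [],
  "meta": {"source": "demo-exam", "topic": "history"},
  "answer": {"type": "choice", "options": ["A", "B", "C", "D"]},
  "score": 1
}
""")

def solve(task):
    """Dispatch on the declared answer type; the returned payloads are illustrative only."""
    answer_type = task["answer"]["type"]
    if answer_type == "choice":
        return 0                        # index of the single selected option
    if answer_type == "multiple_choice":
        return [0, 2]                   # subset of option indices
    if answer_type == "order":
        return [2, 0, 1]                # permutation of option indices
    if answer_type == "matching":
        return {"A": "1", "B": "2"}     # mapping between the two sets
    if answer_type == "text":
        return "Generated essay text ..."
    raise ValueError(f"Unknown answer type: {answer_type}")

print(solve(task))
```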

Evaluation procedure

1. Check-phase
The solution is evaluated on a publicly available set of questions with known answers. This phase is important for checking solutions for potential errors and issues in interaction with the evaluation system. The evaluation result and stdout/stderr output are fully available to the participant.

2. Public Test
The solution is evaluated on a hidden set of questions available only to the organisers. Tasks and answer options within tasks are randomly shuffled on each evaluation.

3. Private Test
The solution is evaluated on the final set of questions. Results on the private test determine the competition winners.

Technical constraints

  • Solution containers are isolated from the outside world:
    no internet access and no communication between solutions.
  • RAM: 16 GB;
  • Maximum solution archive size: 20 GB;
  • Maximum Docker image size (publicly available): 20 GB;
  • Time limit on solution initialization (before task inference): 10 minutes.
    This time is allocated for loading models into memory.
  • Time limit on providing an answer to a single request: 30 minutes (a simple time-budgeting sketch follows the list).
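
The per-request time limit and the score field suggest a simple way to budget computation. Below is a minimal sketch under the hypothetical assumption that one request carries several tasks; the 30-minute figure comes from the constraint above, while the safety margin and the proportional-to-score strategy are illustrative choices, not part of the rules:

```python
REQUEST_TIME_LIMIT = 30 * 60   # seconds, from the per-request limit above
SAFETY_MARGIN = 60             # hypothetical buffer to finish before the deadline

def time_budget(tasks):
    """Split the usable time between tasks proportionally to their maximum score."""
    total_score = sum(t["score"] for t in tasks) or 1
    usable = REQUEST_TIME_LIMIT - SAFETY_MARGIN
    return {t["id"]: usable * t["score"] / total_score for t in tasks}

# Example: a 3-point task receives three times the budget of a 1-point task.
print(time_budget([{"id": "t1", "score": 1}, {"id": "t2", "score": 3}]))
```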

Evaluation criteria

Each question task is evaluated with a metric relevant to its task type (illustrative re-implementations are sketched below):

  • choice - accuracy;
  • multiple_choice - intersection over union of the selected option sets;
  • order - the proportion of correctly ordered pairs;
  • matching - the proportion of correctly matched pairs;
  • text - a special evaluation function, followed by a request for human-expert assessment.

The total solution score is the sum of scores across all question tasks. Each task's score is converted to a 100-point scale based on the official task correspondence table.
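
Illustrative re-implementations of the per-type metrics follow; they are reconstructed from the bullet descriptions above and are not the organisers' official scoring code:

```python
from itertools import combinations

def choice_score(pred, gold):
    """choice: accuracy - 1 if the single selected option is correct, else 0."""
    return float(pred == gold)

def multiple_choice_score(pred, gold):
    """multiple_choice: intersection over union of the selected option sets."""
    pred, gold = set(pred), set(gold)
    return len(pred & gold) / len(pred | gold) if pred | gold else 1.0

def order_score(pred, gold):
    """order: proportion of pairs placed in the correct relative order.

    Assumes pred is a permutation of the items in gold.
    """
    pos = {item: i for i, item in enumerate(pred)}
    pairs = list(combinations(gold, 2))
    correct = sum(pos[a] < pos[b] for a, b in pairs)
    return correct / len(pairs) if pairs else 1.0

def matching_score(pred, gold):
    """matching: proportion of items matched to the correct counterpart."""
    return sum(pred.get(k) == v for k, v in gold.items()) / len(gold)

# Example: two of the three item pairs are in the correct relative order -> 2/3.
print(order_score(["b", "a", "c"], ["a", "b", "c"]))
```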

Essay evaluation

Solution evaluation on essay tasks comprises two stages: automatic scoring and manual human-expert assessment.

The automatic procedure evaluates basic surface-level indicators of the generated texts (a sketch of such checks follows the list):

  • no plagiarism;
  • correspondence to the original topic;
  • orthography;
  • sentence connectivity and tautology;
  • language errors (slang, swearing);
  • paragraph structure;
  • text volume (not too short/long).
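
A minimal sketch of what surface checks of this kind might look like; the thresholds and the exact set of checks are assumptions for illustration, not the organisers' actual implementation:

```python
import re

def surface_checks(essay, min_words=150, max_words=500):
    """Return a few illustrative surface-level indicators for an essay."""
    words = re.findall(r"\w+", essay.lower())
    paragraphs = [p for p in essay.split("\n\n") if p.strip()]
    most_repeated = max((words.count(w) for w in set(words)), default=0)
    return {
        "volume_ok": min_words <= len(words) <= max_words,           # not too short/long
        "has_paragraphs": len(paragraphs) >= 2,                      # paragraph structure
        "low_repetition": most_repeated <= max(3, len(words) // 20)  # crude tautology signal
    }

print(surface_checks("First paragraph about the topic.\n\nA second paragraph develops the argument."))
```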

Automatic scoring is returned straight away but is not the final score; it is a helpful utility for participants.

Manual essay assessment is carried out by professional experts who follow the official grading standards of exam essays.

Results of manual essay assessments are published to the competition leaderboard 1-2 times a week.

If automatic scoring indicates that manual essay assessment would result in 0 points, the participant is informed about it and invited to prepare a new solution for human assessment.

Baseline

Participants are provided with a fully functional baseline solution for this competition:

  • Question task classifier (1-27)
  • 27 separate task models, covering both questions and the essay

The models are provided as a technical example and as an internal validation reference against participants' stronger solutions.

The baseline essay model passes the formal evaluation criteria but does not pass meaningful human assessment grading.

 Github repository  BASELINE SOLUTION  SBER&HUAWEI BASELINE

Supported by

Huawei

FAQ

How do I take part in the AI Journey Competition?

Sign up for this competition using the registration form. Develop your solution. Submit it and see how it ranks among others. Solutions can be resubmitted.

Is participation free?

Yes. Registration and participation are free.

Is this competition solo, or are teams allowed?

Participants are allowed to team up. Each team may have no more than four members. All team members must be registered and listed in the team on the platform.

Can I join some time later?

Yes, solution submission will be available until 1st November, 23:59:59 (UTC+3) inclusive.

When can I choose my final submissions?

Choosing final submissions will be available to participants from 4th September until 1st November, 23:59:59 (UTC+3).

When will the winners be selected?

The TOP 10 winners will be determined and announced on the website by 23:59:59, 4th November (UTC+3).

Who is eligible to participate in the AI Journey competition?

All participants who have reached the age of 18, agreed to the Rules, and built a solution according to the description.

What are the prizes?

First place gets 1 000 000 RUB, second place 500 000 RUB, third place 300 000 RUB, fourth and fifth places 200 000 RUB each, and sixth to tenth places 100 000 RUB each. Moreover, 2 special nominations, "Best question solution" and "Best essay solution", will be awarded 150 000 RUB each.

Are people from other countries eligible to participate?

Yes, they are eligible. Citizens of any country in the world are allowed to participate without restrictions.

When does the registration begin?

The AI Journey competition runs from September 4 to November 1. Registration and solution submission will be available until 1st November, 23:59:59 (UTC+3) inclusive.

Do participants have to choose their final submissions?

Yes. Every participant should choose up to 2 solutions to be scored in the final evaluation. The best score of these 2 solutions will be the participant's final result.

What are the solution evaluation criteria?

Solutions are evaluated automatically by running them on closed test data and comparing the results with the true labels, which are available only to the organizers. The task rating is calculated online and updated in real time.

Will there be any award ceremony?

Yes. The award ceremony will take place on November 9 in Moscow during the AI Journey Conference. Winners of the special nominations will also be announced and awarded at the event.

Does AI Journey have participation constraints?

Participation is not allowed for those who have a direct or indirect relation to the organizers' preparation of competition tasks and data. Participants under these constraints who agreed to the Rules may submit solutions but are not eligible for monetary prizes.