Overview
The "ICDAR 2013 Recognition of Handwritten Historical Texts (RHHT)" competition is organised by the PRHLT group within the framework of the ICDAR 2013 competitions. Its goal is to bring together researchers working on off-line handwritten text recognition (HTR) and have them apply off-line HTR techniques to historical texts, providing a suitable benchmark for comparing their techniques.
In this competition, the task is to recognise a database of historical handwritten text. The database corresponds to a manuscript written in Spanish during the 16th century by a single writer. The book has more than 800 pages of historical chronicles of Spain; most pages consist of a single block of well-separated lines of calligraphic text (see page example below). For the competition, the database is divided into three parts: training, validation, and test.
Description and goals
The participating systems must try to obtain the most accurate recognition results on the test partition. The available data for this task will consist of:
- Image for each line of the training, validation, and test sets (see example below)
- Corresponding transcriptions for each line (only for training and validation sets)
The training partition contains about 10,000 lines, whereas the validation and test partitions contain about 5,000 lines each. An extra test partition of a similar text, but from a different database, may be provided to check the robustness of the systems against small changes in language and handwriting style. A baseline system based on HTK and SRILM is provided, along with a set of scripts that perform a baseline training and test experiment. Participants can use this baseline system as a starting point for their own systems, and are allowed to improve on it by using:
- different feature extraction techniques
- different recognition systems
- different types of models
- ...
Several submissions per participant will be allowed, and all results will be considered when presenting the competition results. With each submission, the participant must provide a brief description of the submitted system, emphasising the main differences between it and the baseline system. The final goal is to analyse the different proposals of the participants.
Evaluation modalities
The evaluation will be performed on the final result of the whole system (from preprocessing to recognition). The evaluation metric is based on final word recognition, and Word Error Rate (WER) will be used to determine the performance of the systems. The winner will be the system that obtains the lowest WER on the test set. A web-based platform will be available for the participants to check their validation and test results.
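As a rough illustration of the metric, WER is conventionally computed as the number of word substitutions, deletions, and insertions needed to turn the recognised text into the reference transcription, divided by the number of words in the reference. The sketch below is not the competition's official scoring tool. It assumes whitespace tokenisation, case-sensitive matching, and aggregation over all lines of a partition. The function names `edit_distance` and `corpus_wer` and the sample lines are hypothetical and do not come from the database.

```python
from typing import Iterable, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Minimum number of word substitutions, deletions and insertions
    needed to turn the reference word sequence into the hypothesis."""
    # prev[j] holds the distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def corpus_wer(pairs: Iterable[Tuple[str, str]]) -> float:
    """WER over (reference, hypothesis) line pairs: total word errors
    divided by the total number of reference words.

    Assumption: words are separated by whitespace and compared exactly.
    """
    errors = 0
    ref_words = 0
    for reference, hypothesis in pairs:
        ref = reference.split()
        hyp = hypothesis.split()
        errors += edit_distance(ref, hyp)
        ref_words += len(ref)
    if ref_words == 0:
        raise ValueError("reference transcriptions contain no words")
    return errors / ref_words


if __name__ == "__main__":
    # Made-up lines for illustration only.
    sample = [
        ("el rey de castilla", "el rei de castilla"),  # one substitution
        ("en el dicho tiempo", "en dicho tiempo"),     # one deletion
    ]
    print(f"WER: {corpus_wer(sample):.2%}")
```

Summing the errors over all lines before dividing weights long lines more heavily than averaging per-line rates would. Which of the two the organisers use is not stated here, so treat this choice as an assumption.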
Registration and access to data
To register for this contest, send an e-mail to cmartine_AT_dsic_DOT_upv_DOT_es with the subject ICDAR 2013 RHHT competition inscription. In the message you must provide the following data:
- Group name and acronym
- Institution
- Participants and e-mail
- Contact person
Schedule
- Feb 15th, 2013: Competition opens, registration period starts, training and validation data available, baseline system available.
- Apr 1st, 2013 (EXTENDED!): Registration deadline (no more participants will be admitted).
- May 1st, 2013 (EXTENDED!): Test data available.
- May 15th, 2013: Deadline for system results.