Latest news:

January 2010:
Webpage updated.


ICPR 2010 Contest: Bi-Modal HTR Recognition


The biMod-IAM-PRHLT corpus is a bimodal dataset of on-line and off-line handwritten text. It comprises a set of approximately 500 handwritten words, with several instances of each word in both the on-line and the off-line modality. Both on-line and off-line samples are word-size segments extracted from hand-copied sentences of the electronic-text LOB corpus. The writers of the on-line and off-line samples are (generally) different. The off-line samples are grey-level images (PNG format), and the on-line samples are sequences of X-Y coordinates (UNIPEN format) describing the trajectory of an electronic pen while writing the same word. The task consists of classifying bi-modal samples (pairs of on-line and off-line handwritten words) into a set of 519 classes (words). The task is simple enough that experiments are easy to run and results are not affected by extraneous factors (such as language-model estimation issues), while still capturing the essential challenges of the bi-modal fusion problem.




Basic statistics of the biMod-IAM-PRHLT corpus and its standard partitions:

                              on-line   off-line
Word classes (vocabulary)         519        519
Running words:
  training                       8342      14409
  validation                      519        519
  (hidden) test                   519        519
  total                          9380      15447

Ground truth:

Validation data: double-checked
Hidden test data: double-checked
Training data: semi-automatically produced and checked


Corpus package contents:

ASCII (ground truth, lists, etc.)
PNG (off-line handwritten text images)
UNIPEN (on-line handwritten text samples)

Evaluation measures:

Classification error rate
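As an illustration of the evaluation measure, the classification error rate is simply the fraction of samples whose hypothesized word differs from the reference word. A minimal sketch (the function name is our own, not part of any contest software):

```python
def classification_error_rate(hypotheses, references):
    """Fraction of samples whose hypothesized word differs from the reference."""
    assert len(hypotheses) == len(references) and references
    errors = sum(h != r for h, r in zip(hypotheses, references))
    return errors / len(references)
```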

Baseline results on the validation data:

Standard preprocessing and feature-extraction procedures and character-based HMM word models were used.

The best classification error rates were 6.6% for the on-line modality (an average of 12 states per model and 8 Gaussians per state) and 27.6% for the off-line modality (an average of 8 states per model and 64 Gaussians per state).

If we consider each HMM as a whole model, we can compute a lower bound of the classification error as the rate with which both on-line and off-line models produce a wrong hypothesis for the same input. For validation data, this lower bound error rate was 2.3%.
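The lower bound described above can be computed by counting a sample as an error only when both single-modality classifiers are wrong, i.e. an oracle that picks whichever hypothesis is correct when at least one is. A hypothetical sketch:

```python
def oracle_lower_bound(on_hyps, off_hyps, references):
    """Oracle error rate: a sample counts as wrong only if BOTH the
    on-line and the off-line hypotheses differ from the reference."""
    both_wrong = sum(h_on != r and h_off != r
                     for h_on, h_off, r in zip(on_hyps, off_hyps, references))
    return both_wrong / len(references)
```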

To obtain baseline bi-modal results, we used a naive Bayes classifier. In practice, we used a weighted-log version of this classifier (also assuming uniform priors), which aims to balance the relative reliability of the on-line (x) and off-line (y) models:

ĉ = argmax_c { (1-α) log P(x|c) + α log P(y|c) }
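The decision rule above can be sketched directly, given per-class log-likelihoods from each modality (a minimal illustration with assumed dictionary inputs keyed by class; not the contest implementation):

```python
def fuse_classify(log_p_on, log_p_off, alpha):
    """Weighted log-linear naive Bayes fusion with uniform priors.

    log_p_on[c]  -- log P(x|c) from the on-line model
    log_p_off[c] -- log P(y|c) from the off-line model
    alpha        -- weight balancing the two modalities (0 = on-line only)
    Returns the class c maximizing (1-alpha)*log P(x|c) + alpha*log P(y|c).
    """
    return max(log_p_on,
               key=lambda c: (1 - alpha) * log_p_on[c] + alpha * log_p_off[c])
```

In practice, the weight α would be tuned on the validation data so that the more reliable modality dominates the fused score.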

Classification error rates (%) on the validation data:

on-line   off-line   Lower bound   Naive Bayes   Relative improv.
  6.6       27.6         2.3           4.0             39%


[1] U.-V. Marti and H. Bunke. The IAM-database: an English sentence database for offline handwriting recognition. International Journal on Document Analysis and Recognition, 5:39-46, 2002.

[2] M. Liwicki and H. Bunke. IAM-OnDB - an on-line English sentence database acquired from handwritten text on a whiteboard. In Proc. 8th Int. Conf. on Document Analysis and Recognition, pages 956-961, 2005.

[3] M. Pastor, E. Vidal, and F. Casacuberta. A bi-modal handwritten text corpus. Technical report, Instituto Tecnológico de Informática, September 2009.