Corpus


The corpus is now available for download. The source collection can be downloaded in Hindi and Gujarati. The target collection of English news stories will be released soon. All documents use the following text markup:

<story>
<title>xxxxxx</title>
<date>xx-xx-xxxx</date>
<content>
xxxxxx
</content>
</story>
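
As an illustration of how a document in this markup might be read, here is a minimal Python sketch. It assumes each file holds a single story whose tags appear exactly as shown above. The function name `parse_story` and the sample text are hypothetical and are not part of the corpus or its tools. A regular expression is used instead of an XML parser because the page does not say whether the files are well-formed XML.

```python
import re
from typing import Dict, Optional

# Tag names are taken from the markup shown above; all other names are illustrative.
_FIELDS = ("title", "date", "content")


def parse_story(text: str) -> Dict[str, Optional[str]]:
    """Extract the title, date and content fields from one <story> document.

    A field that is missing from the text is returned as None.
    """
    story: Dict[str, Optional[str]] = {}
    for field in _FIELDS:
        # DOTALL lets the content field span several lines.
        match = re.search(rf"<{field}>(.*?)</{field}>", text, flags=re.DOTALL)
        story[field] = match.group(1).strip() if match else None
    return story


# Hypothetical sample document, made up for this example.
sample = """<story>
<title>Example headline</title>
<date>01-02-2012</date>
<content>
First line of the story.
Second line of the story.
</content>
</story>"""

parsed = parse_story(sample)
print(parsed["title"])
print(parsed["date"])
```

The date is kept as a plain string because the `xx-xx-xxxx` pattern does not say which part is the day and which is the month. The character encoding of the files is also not stated here, so it would need to be checked before reading them from disk.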

Source Collection



Target Collection



The relevance judgements and the evaluation script are available in the Evaluation section of this portal.



Notes:
To obtain the key needed to extract the collections, send an email to clinss@dsic.upv.es with the following information (all fields are required):

Name
Affiliation
Purpose
