Resource Summary
A Named Entity Recognition evaluation for MALLET, including English and Chinese samples
20110422 MALLET-EVAL PROJECT
GENERAL
This is a project for evaluating MALLET (MAchine Learning for
LanguagE Toolkit). MALLET's binaries and source code are not included;
you can obtain them from this site:
http://mallet.cs.umass.edu/
This distribution contains only sample annotation data and scripts for
converting, importing, and evaluating it. The articles in the two corpora are not included
for copyright reasons, so you need the corpora's CDs to build the complete data
sets.
We provide two sample corpora: the Penn Treebank Sample (a 5% fragment of the Penn
Treebank) and the HIT CIR LTP Corpora Sample (a 10% fragment of the whole
corpus):
http://web.mit.edu/course/6/6.863/OldFiles/share/data/corpora/treebank/
http://ir.hit.edu.cn/demo/ltp/Sharing_Plan.htm
BUILDING