Resource Introduction
The HIT-SW database is intended to facilitate offline, writer-dependent recognition of Chinese handwritten text.
You can download it freely here.
1 Overview
A database of Chinese handwritten documents was collected by our institute to evaluate the recognizer. It was written by a single writer, so we refer to it hereafter as the HIT-SW database (HIT is the abbreviation of Harbin Institute of Technology, and SW indicates that it was produced by a single writer). The English homepage of the HIT-SW database is https://code.google.com/p/hitswdatabase and the Chinese version is at http://hi.baidu.com/hitmw/ .
The underlying text of the HIT-SW database was randomly sampled from the China Daily corpus. We asked the writer to hand-copy the text onto A4 paper without a ruler as a reference. The final samples were digitized at a resolution of 300 DPI and then binarized using the Otsu algorithm. In total, there are 83 handwritten documents capturi