Resource description
Extract-Tmx-Corpus is a Windows program (Vista and XP are supported) that lets translators, including those without a deep knowledge of linguistic tools, create highly customised corpora for use with the Moses machine translation system and with other systems.
To create corpora that are most useful for training machine translation systems, one should strive to include segments that are relevant to the task in hand. One way of finding such segments is to use existing translation memory files (TMX files), so that the corpora can be customised for the person or for the type of task in question. The present program takes such files as input.
The program can create strictly aligned corpora for a single pair of languages, several pairs of languages or all the pairs of languages contained in the TMX files.
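As a rough illustration of what extracting one language pair from a TMX file involves, here is a minimal Python sketch. It is not the program's own code: the function name `extract_pairs` and the sample data `SAMPLE_TMX` are hypothetical, and the sketch assumes that "strictly aligned" means a translation unit is kept only when it has a non-empty segment in both languages. It relies only on the standard TMX layout (`tu` elements containing `tuv` elements with an `xml:lang` attribute and a `seg` child).

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample data, invented for this illustration.
SAMPLE_TMX = """<tmx version="1.4">
  <body>
    <tu>
      <tuv xml:lang="en-GB"><seg>Hello world</seg></tuv>
      <tuv xml:lang="pt-PT"><seg>Olá mundo</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="en-GB"><seg>Only English here</seg></tuv>
    </tu>
  </body>
</tmx>"""


def extract_pairs(tmx_text, src_lang, tgt_lang):
    """Return (source, target) segment pairs for one language pair.

    Assumption: a translation unit is skipped unless both languages
    have a non-empty segment, so the two sides stay line-aligned.
    """
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        segments = {}
        for tuv in tu.findall("tuv"):
            # Older TMX files use a plain "lang" attribute instead of xml:lang.
            lang = tuv.get(XML_LANG) or tuv.get("lang") or ""
            seg = tuv.find("seg")
            if seg is None:
                continue
            # Flatten inline markup and collapse whitespace to a single line.
            text = " ".join("".join(seg.itertext()).split())
            if text:
                segments[lang.lower()] = text
        src = segments.get(src_lang.lower())
        tgt = segments.get(tgt_lang.lower())
        if src and tgt:
            pairs.append((src, tgt))
    return pairs


if __name__ == "__main__":
    for src, tgt in extract_pairs(SAMPLE_TMX, "en-GB", "pt-PT"):
        print(src, "|||", tgt)
```

Handling several language pairs, or all pairs in a file, would amount to calling such a function once per pair; the language codes are compared case-insensitively here, which is a design choice of the sketch rather than documented behaviour of the program.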
The program creates 2 separa