Resource overview
Storage and distribution of the europa-corpus, with tools included.
Why should I use this corpus?
Because its documents, press releases from the European Community, are:
* written in 23 different languages
* semi-structured with XHTML tags
* homogeneously encoded in UTF-8
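
The properties listed above mean that a document can, in principle, be read with a standard XML parser. The following is a minimal sketch, not the project's own tooling: the function name `extract_paragraphs` and the sample document are hypothetical, and it assumes the XHTML is well-formed, uses the standard XHTML namespace, and contains no undefined named entities (such as `&nbsp;`). Real corpus files may be structured differently.

```python
import xml.etree.ElementTree as ET

# Standard XHTML namespace, in ElementTree's "{uri}tag" notation.
XHTML_NS = "{http://www.w3.org/1999/xhtml}"


def extract_paragraphs(xhtml_bytes):
    """Return the text of every non-empty <p> element in an XHTML document.

    The input is expected to be UTF-8 encoded bytes (assumption: the
    document is well-formed XML).
    """
    root = ET.fromstring(xhtml_bytes)
    paragraphs = []
    for p in root.iter(XHTML_NS + "p"):
        # itertext() also collects text nested in inline tags such as <b>.
        text = "".join(p.itertext()).strip()
        if text:
            paragraphs.append(text)
    return paragraphs


# Hypothetical sample document, for illustration only.
SAMPLE = (
    '<?xml version="1.0" encoding="utf-8"?>'
    '<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="fr">'
    "<head><title>Communiqué de presse</title></head>"
    "<body><p>Première <b>phrase</b>.</p><p>Deuxième phrase.</p></body>"
    "</html>"
).encode("utf-8")

if __name__ == "__main__":
    for paragraph in extract_paragraphs(SAMPLE):
        print(paragraph)
```

Because every file shares the same encoding and markup family, the same function could be applied unchanged to documents in any of the corpus languages.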
Figures
23 different languages
български (bg)    čeština (cs)       dansk (da)
Deutsch (de)      eesti (et)         ελληνικά (el)
English (en)      español (es)       français (fr)
Gaeilge (ga)      italiano (it)      lietuvių (lt)
latviešu (lv)     magyar (hu)        Malti (mt)
Nederlands (nl)   polski (pl)        português (pt)
română (ro)       slovenčina (sk)    slovenščina (sl)