A Java library containing a set of tokenisers that split text into its constituent words.
Resource Description
jTokeniser is a set of classes that provide a variety of tokenisers for your Java projects. Simple tokenisers such as WhiteSpaceTokeniser or StringTokeniser provide basic token extraction, whereas RegexTokeniser and BreakIteratorTokeniser offer more advanced options for more thorough tokenisation that also discards punctuation. Recent additions include RegexSeparatorTokeniser, which allows complex definitions of token delimiters. A SentenceTokeniser is also provided for segmenting text into sentences.
There is also a GUI front end for experimenting with the tokenisers without having to write code.
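
The difference between the tokenisation strategies described above can be illustrated with a minimal sketch. It does not use the jTokeniser API, whose class and method signatures are not shown here; it relies only on standard JDK classes (String.split, java.util.regex, java.text.BreakIterator). The class name TokeniserSketch and its method names are hypothetical, chosen for illustration only.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; NOT part of jTokeniser.
class TokeniserSketch {

    // Whitespace splitting: punctuation stays attached to the words.
    static List<String> whitespaceTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Regex matching: keep runs of letters and digits, discard punctuation.
    static List<String> regexTokens(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = Pattern.compile("[\\p{L}\\p{N}]+").matcher(text);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    // BreakIterator word boundaries: keep only segments that start
    // with a letter or digit, which drops spaces and punctuation.
    static List<String> breakIteratorTokens(String text) {
        BreakIterator it = BreakIterator.getWordInstance(Locale.ENGLISH);
        it.setText(text);
        List<String> tokens = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String piece = text.substring(start, end);
            if (Character.isLetterOrDigit(piece.codePointAt(0))) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    // BreakIterator sentence boundaries: split text into trimmed sentences.
    static List<String> sentences(String text) {
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(text);
        List<String> result = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                result.add(sentence);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "Hello, world! It works.";
        System.out.println(whitespaceTokens(text));
        System.out.println(regexTokens(text));
        System.out.println(breakIteratorTokens(text));
        System.out.println(sentences(text));
    }
}
```

With the sample input, the whitespace variant keeps "Hello," and "world!" with their punctuation, while the regex and BreakIterator variants return bare words. The tokenisers in the library presumably wrap similar mechanisms, but consult README.txt for the actual usage.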
File List
jTokeniser-2.0.jar
lib
swing-layout-1.0.jar
README.txt