Resource Description
The program gathers statistics over a corpus and then computes the probability of each sentence. To exploit the structure of the Chinese GB encoding, it first processes each character's internal code during counting and then writes the count directly into the corresponding array element. This avoids the time that would otherwise be spent comparing characters against characters and words against words while counting.
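The direct-indexing idea can be sketched as follows. This is only an illustration under stated assumptions, not the program's actual code: it assumes GB2312 double-byte codes (lead byte 0xA1-0xF7, trail byte 0xA1-0xFE), a character-level unigram model, and add-one smoothing, none of which the description specifies. The names `gb_index`, `count_chars`, and `sentence_log_prob` are hypothetical.

```python
import math

ROWS, COLS = 87, 94          # GB2312 lead bytes 0xA1-0xF7, trail bytes 0xA1-0xFE
TABLE_SIZE = ROWS * COLS     # one array slot per possible double-byte code


def gb_index(ch):
    """Map one GB2312 double-byte character to its array slot, or None."""
    try:
        raw = ch.encode("gb2312")
    except UnicodeEncodeError:
        return None          # character is outside GB2312
    if len(raw) != 2:
        return None          # single-byte (ASCII) character, skipped here
    return (raw[0] - 0xA1) * COLS + (raw[1] - 0xA1)


def count_chars(sentences):
    """Count characters by writing straight into the indexed array."""
    counts = [0] * TABLE_SIZE
    total = 0
    for sentence in sentences:
        for ch in sentence:
            i = gb_index(ch)
            if i is not None:
                counts[i] += 1   # no character-to-character comparison needed
                total += 1
    return counts, total


def sentence_log_prob(sentence, counts, total):
    """Log-probability of a sentence under an assumed add-one unigram model."""
    logp = 0.0
    for ch in sentence:
        i = gb_index(ch)
        if i is not None:
            logp += math.log((counts[i] + 1) / (total + TABLE_SIZE))
    return logp


counts, total = count_chars(["中国", "中文"])
print(sentence_log_prob("中国", counts, total))
```

Because the array position is computed arithmetically from the internal code, each update is a constant-time operation, with no search or string comparison. Word-level counts could be handled in a similar way by combining the indices of adjacent characters, at the cost of a much larger table.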