

High-performance Chinese word segmentation module, Python

  • Resource size: 754.28 kB
  • Upload date: 2021-06-29
  • Tags: python, Chinese, word segmentation, high performance, module

Description

pymmseg-cpp is a Python port of the rmmseg-cpp project. rmmseg-cpp is an implementation of the MMSEG Chinese word segmentation algorithm, written in C++ with a Ruby interface.

Download the binary release on the right sidebar and copy the pymmseg directory to your Python path (e.g. /usr/lib/python2.5/site-packages/). Here's an example of usage:

```
from pymmseg import mmseg

# Load the default dictionaries before creating any Algorithm
mmseg.dict_load_defaults()
text = # ...
algor = mmseg.Algorithm(text)
# Each token carries its text and its start/end offsets in the input
for tok in algor:
    print "%s [%d..%d]" % (tok.text, tok.start, tok.end)
```

Alternatively, you can download the source tarball or check out the latest code from the git repo hosted at github. In that case you'll need to build the mmseg-cpp module yourself: go to the mmseg-cpp subdirectory and run the build.py script, which builds the native module for you. For more information, refer to the …
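For readers unfamiliar with MMSEG, the idea it builds on is maximum matching against a dictionary. The following is a minimal sketch of the simplest form, greedy forward maximum matching, and is not how pymmseg-cpp is implemented. The names `Token`, `TOY_DICT` and `simple_max_match` are hypothetical, the dictionary is a toy one, and offsets here are character positions, which may differ from the offsets the library reports.

```python
from collections import namedtuple

# Hypothetical token type mirroring the text/start/end fields used above.
Token = namedtuple("Token", ["text", "start", "end"])

# Toy dictionary for illustration only; a real dictionary has many thousands of entries.
TOY_DICT = {"研究", "研究生", "生命", "起源"}


def simple_max_match(text, words=TOY_DICT):
    """Greedy forward maximum matching: at each position take the longest dictionary word."""
    max_len = max((len(w) for w in words), default=1)
    tokens = []
    pos = 0
    while pos < len(text):
        # Fall back to a single character when no multi-character word matches.
        match = text[pos]
        # Try the longest candidate first, shrinking down to two characters.
        for size in range(min(max_len, len(text) - pos), 1, -1):
            candidate = text[pos:pos + size]
            if candidate in words:
                match = candidate
                break
        tokens.append(Token(match, pos, pos + len(match)))
        pos += len(match)
    return tokens


for tok in simple_max_match("研究生命起源"):
    print("%s [%d..%d]" % (tok.text, tok.start, tok.end))
# 研究生 [0..3]
# 命 [3..4]
# 起源 [4..6]
```

The greedy choice of 研究生 leaves 命 stranded, although 研究 / 生命 / 起源 is the more natural reading. The full MMSEG algorithm adds chunk-based disambiguation rules on top of maximum matching that are intended to handle ambiguities of this kind.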

File list

pymmseg-cpp
bin
pymmseg
data
__init__.py
mmseg.py
release.py
mmseg-cpp
algor.h
README
bin