Resource Overview
TRMiner is a Python tool aimed at scientific data curators.
It rapidly prunes large collections of scientific publications down to the sentences relevant to a given mining goal.
The approach works in two steps.
First, texts are translated into sequences of tokens that stand for relevant words.
Second, regular expression patterns are searched in the token sequences.
Matches are translated back into natural-language sentences and presented as HTML5-based output that lets manual curators sort and rate matches for further reading and information extraction.
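The two steps can be illustrated with a minimal sketch. This is not TRMiner's code or its configuration format: the names TOKEN_MAP, PATTERN, tokenize and find_matches, the one-letter tokens and the example pattern are all assumptions made up for illustration. It only shows the general idea of mapping relevant words to tokens, searching the token string with a regular expression, and mapping a match back to the original sentence.

```python
import re

# Hypothetical token map (an assumption, not TRMiner's file format):
# each relevant word is represented by a one-letter token.
TOKEN_MAP = {
    "protein": "p",
    "binds": "b",
    "bind": "b",
    "domain": "d",
}

# Hypothetical search pattern over the token alphabet:
# a protein, one or more binding words, then a domain.
PATTERN = re.compile(r"pb+d")


def tokenize(sentence):
    """Step 1: translate a sentence into a token string.

    Words without an entry in TOKEN_MAP are skipped. The list
    `positions` records which word each token came from, so a
    match can be translated back later.
    """
    words = re.findall(r"[A-Za-z0-9-]+", sentence)
    tokens, positions = [], []
    for index, word in enumerate(words):
        token = TOKEN_MAP.get(word.lower())
        if token is not None:
            tokens.append(token)
            positions.append(index)
    return "".join(tokens), words, positions


def find_matches(text):
    """Step 2: search the pattern in each sentence's token string.

    Returns a list of (sentence, matched_words) pairs.
    """
    # Naive sentence splitting on ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    hits = []
    for sentence in sentences:
        token_string, words, positions = tokenize(sentence)
        match = PATTERN.search(token_string)
        if match:
            matched_words = [words[positions[i]]
                             for i in range(match.start(), match.end())]
            hits.append((sentence, matched_words))
    return hits


if __name__ == "__main__":
    example = ("The protein binds the SH3 domain. "
               "The domain is large. "
               "Samples were stored overnight.")
    for sentence, matched_words in find_matches(example):
        print(sentence)
        print(matched_words)
```

Because the pattern is searched in the short token string and not in the raw text, irrelevant words between the relevant ones do not need to be described in the pattern. A real tool would also need proper sentence splitting and output rendering, which this sketch leaves out.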
Installation
You need to have Python >= 2.7.
TRMiner is listed in the Python Package Index. You can therefore install it with:
easy_install trminer
in a terminal.
Usage
TRMiner needs two configuration files: a token map and a definition of search patterns over the token alphabet.
The token map, e.g.
d domain b