Resource Overview
This is a naive Bayesian text classifier that, given a piece of text, tells you the posterior probability (returned as a log likelihood) that the text comes from each of the standard scientific article sections (Introduction, Methods, Results, Discussion).
A piece of text can be anything you like, from a few words to a whole article; you decide on the boundaries.
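To make the idea concrete, here is a minimal sketch of a multinomial naive Bayes scorer over the four sections. It is not the project's own code: the class name `SectionNaiveBayes`, its methods, the add-one smoothing, and the toy training sentences are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

SECTIONS = ("Introduction", "Methods", "Results", "Discussion")


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class SectionNaiveBayes:
    """Hypothetical multinomial naive Bayes model (illustration only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # additive (Laplace) smoothing constant, an assumption
        self.word_counts = {s: Counter() for s in SECTIONS}
        self.doc_counts = Counter()
        self.vocab = set()

    def train(self, text, section):
        """Add one labelled piece of text to the counts."""
        tokens = tokenize(text)
        self.word_counts[section].update(tokens)
        self.doc_counts[section] += 1
        self.vocab.update(tokens)

    def log_scores(self, text):
        """Return log P(section) + sum of log P(word | section) per section."""
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for s in SECTIONS:
            score = math.log((self.doc_counts[s] + self.alpha)
                             / (total_docs + self.alpha * len(SECTIONS)))
            total_words = sum(self.word_counts[s].values())
            for t in tokens:
                # Unseen words still get a small smoothed probability.
                score += math.log((self.word_counts[s][t] + self.alpha)
                                  / (total_words + self.alpha * vocab_size))
            scores[s] = score
        return scores

    def log_posteriors(self, text):
        """Normalise the log scores so the posteriors sum to 1 (log-sum-exp)."""
        scores = self.log_scores(text)
        m = max(scores.values())
        log_z = m + math.log(sum(math.exp(x - m) for x in scores.values()))
        return {s: x - log_z for s, x in scores.items()}


# Toy training data, invented for this sketch.
clf = SectionNaiveBayes()
clf.train("previous studies have suggested that little is known about", "Introduction")
clf.train("samples were incubated and centrifuged at room temperature", "Methods")
clf.train("we observed a significant increase as shown in figure", "Results")
clf.train("these findings suggest that further work is needed", "Discussion")

posteriors = clf.log_posteriors("cells were centrifuged and incubated overnight")
best = max(posteriors, key=posteriors.get)
print(best, posteriors)
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied, which is one plausible reason for returning log values rather than raw probabilities.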
You can use it to:
Pull out all the pieces of text in an article that belong to your chosen section.
Score each piece of text in 4 dimensions (i.e. how introductory, methodological, results-based or discussion-like it is), which may be useful for finding similar text.
Create training data, etc.
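As a rough sketch of the first two uses, the snippet below takes four log scores per piece of text (in the order Introduction, Methods, Results, Discussion), picks the pieces whose top-scoring section matches a chosen one, and compares pieces by the cosine similarity of their normalised score vectors. The function names, the choice of cosine similarity, and the score values are all hypothetical; real scores would come from the classifier.

```python
import math

SECTIONS = ("Introduction", "Methods", "Results", "Discussion")


def to_probabilities(log_scores):
    """Turn four log scores into probabilities that sum to 1 (softmax)."""
    m = max(log_scores)
    exps = [math.exp(x - m) for x in log_scores]
    z = sum(exps)
    return [e / z for e in exps]


def best_section(log_scores):
    """Name of the section with the highest log score."""
    index = max(range(len(SECTIONS)), key=lambda i: log_scores[i])
    return SECTIONS[index]


def select_section(scored_chunks, section):
    """Keep the chunk ids whose top-scoring section is the chosen one."""
    return [cid for cid, s in scored_chunks.items() if best_section(s) == section]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity(scores_a, scores_b):
    """Compare two chunks by their 4-dimensional section profiles."""
    return cosine_similarity(to_probabilities(scores_a), to_probabilities(scores_b))


# Made-up log scores for three chunks, for illustration only.
scored_chunks = {
    "chunk_a": [-12.1, -3.2, -9.8, -11.0],
    "chunk_b": [-11.5, -3.9, -8.7, -10.2],
    "chunk_c": [-4.0, -10.3, -9.1, -5.2],
}

print(select_section(scored_chunks, "Methods"))
print(similarity(scored_chunks["chunk_a"], scored_chunks["chunk_b"]))
print(similarity(scored_chunks["chunk_a"], scored_chunks["chunk_c"]))
```

Normalising before comparing means that two pieces of very different length, whose raw log scores differ greatly in magnitude, can still be judged similar if they lean towards the same sections.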
This project contains:
A local version of the classifier, for better performance (see Downloads, or choose a version from the downloads list to the rig