Resource Description
This package contains Java implementations of the most commonly used algorithms and auxiliary utilities for fuzzy (similarity) string search in large dictionaries:
Levenshtein Distance (with cutoff and prefix version)
Damerau-Levenshtein Distance (with cutoff and prefix version)
Extension (Spell-checker) Method
N-Gram Method (with some modifications)
Signature Hash Method
Bitap (Shift-Or with Wu-Manber modifications)
Burkhard-Keller (BK) Trees
Skip algorithm
All implementations favor simplicity and clarity, so that the way each algorithm works is easy to follow.
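To illustrate the first entry in the list, here is a minimal sketch of Levenshtein distance with a cutoff. It is not taken from LevensteinMetric.java in this archive; the class name LevenshteinSketch, the method signature, and the convention of returning cutoff + 1 when the limit is exceeded are all assumptions made for this example. The cutoff is assumed to be non-negative.

```java
// Hypothetical illustration; not the code shipped in this archive.
class LevenshteinSketch {

    /**
     * Returns the edit distance between a and b if it is at most cutoff.
     * Otherwise returns cutoff + 1 (an assumed convention for "too far").
     */
    static int distance(String a, String b, int cutoff) {
        int n = a.length();
        int m = b.length();

        // The length difference is a lower bound on the edit distance.
        if (Math.abs(n - m) > cutoff) {
            return cutoff + 1;
        }

        int[] prev = new int[m + 1];
        int[] curr = new int[m + 1];
        for (int j = 0; j <= m; j++) {
            prev[j] = j; // distance from the empty prefix of a
        }

        for (int i = 1; i <= n; i++) {
            curr[0] = i;
            int rowMin = curr[0];
            for (int j = 1; j <= m; j++) {
                int cost = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                int deletion = prev[j] + 1;
                int insertion = curr[j - 1] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(deletion, insertion), substitution);
                rowMin = Math.min(rowMin, curr[j]);
            }
            // Row minima never decrease from one row to the next, so once
            // a whole row exceeds the cutoff the final result must too.
            if (rowMin > cutoff) {
                return cutoff + 1;
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return Math.min(prev[m], cutoff + 1);
    }

    public static void main(String[] args) {
        System.out.println(distance("kitten", "sitting", 10)); // prints 3
        System.out.println(distance("kitten", "sitting", 1));  // prints 2 (cutoff + 1)
    }
}
```

The early exit is what makes a cutoff worthwhile in dictionary search: most candidate words are far from the query, so the comparison can usually stop after a few rows instead of filling the whole table.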
Related articles are at http://ntz-develop.blogspot.com/
You can check out the sources from the SVN repository at http://code.google.com/p/fuzzy-search-tools/source/browse/ or download a source snapshot at http://code.google.com/p/fuzzy-search-tools/downloads/list
File List
src
ru
fuzzysearch
.svn
text-base
ru
fuzzysearch
OnlineSearcher.java
ExtensionIndex.java
BinarySearch.java
IntArrays.java
FuzzySearch.java
DamerauLevensteinMetric.java
LevensteinMetric.java
NGramSearcherM2.java
NGramSearcher.java
IntComparator.java
NGramIndexerM2.java
SkipSearcher.java
NGramIndexer.java
RussianAlphabet.java
SkipIndexer.java
WordSearcher.java
ExtensionSearcher.java
NGramIndexM1.java
EnglishAlphabet.java
SimpleAlphabet.java
ExtensionIndexer.java
SignHashIndex.java
Searcher.java
WordIndex.java
BKTreeIndex.java
MetricOnlineSearcher.java
Dictionary.java
Index.java
NGramSearcherM1.java
Alphabet.java
BitapOnlineSearcher.java
NGramIndexerM1.java
SignHashSearcher.java
Normalizer.java
SignHashIndexer.java
Metric.java
BKTreeSearcher.java
UnionAlphabet.java
BKTreeIndexer.java
NGramIndexM2.java
NGramIndex.java
WordOnlineSearcher.java
SkipIndex.java
Indexer.java
phonetic
MetaphoneRussian.java
.svn
all-wcprops
entries
entries
all-wcprops
entries
all-wcprops
all-wcprops
entries
text-base
MetaphoneRussian.java.svn-base
BKTreeIndexer.java.svn-base
OnlineSearcher.java.svn-base
ExtensionIndex.java.svn-base
BinarySearch.java.svn-base
IntArrays.java.svn-base
FuzzySearch.java.svn-base
DamerauLevensteinMetric.java.svn-base
LevensteinMetric.java.svn-base
NGramSearcherM2.java.svn-base
NGramSearcher.java.svn-base
IntComparator.java.svn-base
NGramIndexerM2.java.svn-base
SkipSearcher.java.svn-base
NGramIndexer.java.svn-base
RussianAlphabet.java.svn-base
SkipIndexer.java.svn-base
WordSearcher.java.svn-base
ExtensionSearcher.java.svn-base
NGramIndexM1.java.svn-base
EnglishAlphabet.java.svn-base
SimpleAlphabet.java.svn-base
ExtensionIndexer.java.svn-base
SignHashIndex.java.svn-base
Searcher.java.svn-base
WordIndex.java.svn-base
BKTreeIndex.java.svn-base
MetricOnlineSearcher.java.svn-base
Dictionary.java.svn-base
Index.java.svn-base
NGramSearcherM1.java.svn-base
Alphabet.java.svn-base
BitapOnlineSearcher.java.svn-base
NGramIndexerM1.java.svn-base
SignHashSearcher.java.svn-base
Normalizer.java.svn-base
SignHashIndexer.java.svn-base
Metric.java.svn-base
BKTreeSearcher.java.svn-base
UnionAlphabet.java.svn-base
NGramIndexM2.java.svn-base
NGramIndex.java.svn-base
WordOnlineSearcher.java.svn-base
SkipIndex.java.svn-base
Indexer.java.svn-base