Resource Overview
This is a C++ implementation of the "space-saving" algorithm described in:
A. Metwally, D. Agrawal, and A. El Abbadi. Efficient Computation of Frequent and Top-k Elements in Data Streams. In Proceedings of the 10th International Conference on Database Theory (ICDT), pages 398–412, 2005.
This project is released into the public domain; you can use the source code however you want.
The example program (runner.cpp) finds the most frequently occurring substrings of length N in a file.
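For readers unfamiliar with the algorithm, the following is a minimal sketch of the Space-Saving idea applied to substrings of length N. It is not the code from this project: the names `SpaceSavingSketch`, `Counter`, `offer`, `top`, and `count_substrings` are hypothetical, and the eviction step uses a simple linear scan for the minimum counter rather than the constant-time "stream summary" structure described in the paper.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical illustration of Space-Saving; not this project's actual API.
struct Counter {
    std::size_t count;  // estimated frequency (never below the true count)
    std::size_t error;  // upper bound on how much `count` may overestimate
};

class SpaceSavingSketch {
public:
    typedef std::pair<const std::string, Counter> Entry;

    explicit SpaceSavingSketch(std::size_t capacity) : capacity_(capacity) {}

    // Process one item from the stream.
    void offer(const std::string& item) {
        if (capacity_ == 0) return;  // nothing can be monitored

        std::unordered_map<std::string, Counter>::iterator it = counters_.find(item);
        if (it != counters_.end()) {  // already monitored: just increment
            ++it->second.count;
            return;
        }
        if (counters_.size() < capacity_) {  // free slot: start a new counter
            counters_[item] = Counter{1, 0};
            return;
        }
        // Table is full: replace the item with the smallest count.
        // (Linear scan for simplicity; an assumption of this sketch.)
        std::unordered_map<std::string, Counter>::iterator min_it =
            std::min_element(counters_.begin(), counters_.end(),
                             [](const Entry& a, const Entry& b) {
                                 return a.second.count < b.second.count;
                             });
        const std::size_t min_count = min_it->second.count;
        counters_.erase(min_it);
        // The newcomer inherits the evicted count as its possible error.
        counters_[item] = Counter{min_count + 1, min_count};
    }

    // Monitored items, sorted by estimated count in descending order.
    std::vector<std::pair<std::string, Counter> > top() const {
        std::vector<std::pair<std::string, Counter> > result(counters_.begin(),
                                                             counters_.end());
        std::sort(result.begin(), result.end(),
                  [](const std::pair<std::string, Counter>& a,
                     const std::pair<std::string, Counter>& b) {
                      return a.second.count > b.second.count;
                  });
        return result;
    }

private:
    std::size_t capacity_;
    std::unordered_map<std::string, Counter> counters_;
};

// Feed every substring of length n from `text` into a sketch with `capacity` counters.
SpaceSavingSketch count_substrings(const std::string& text, std::size_t n,
                                   std::size_t capacity) {
    SpaceSavingSketch sketch(capacity);
    if (n == 0 || text.size() < n) return sketch;
    for (std::size_t i = 0; i + n <= text.size(); ++i) {
        sketch.offer(text.substr(i, n));
    }
    return sketch;
}
```

Calling `count_substrings(text, N, k).top()` would return at most `k` candidate substrings with their estimated counts. An entry whose `count - error` still exceeds the count of the next entry is guaranteed to be ranked correctly; the others are only candidates.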