Resource Summary
ApproxMAP: Approximate Sequential Pattern Mining via Multiple Alignment
Sequential pattern mining is an important data mining task with broad applications. Conventional methods face inherent difficulties when mining databases that contain long sequences and noise: they may generate a huge number of short, trivial patterns yet fail to find the interesting underlying patterns. To address these problems, this project proposes approximate sequential pattern mining, roughly defined as identifying patterns that are approximately shared by many sequences.
We present an efficient and effective algorithm, ApproxMAP (APPROXimate Multiple Alignment Pattern mining), to mine approximate consensus sequential patterns from large databases. The method works in three steps. First, sequences are clustered by similarity. Second, each cluster is compressed into a weighted sequence through multiple alignment. Third, the longest underlying pattern that best fits each cluster is generated.
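The three steps can be illustrated with a small Python sketch. It is a simplified illustration under stated assumptions, not the authors' implementation. The summary above does not specify the distance measure, the clustering method, or the rule for generating the pattern. The sketch therefore assumes an edit distance over itemsets with a normalized set-difference replacement cost, a greedy threshold clustering as a stand-in, and a frequency cutoff for the consensus. All function names and the parameters `max_dist` and `min_strength` are hypothetical.

```python
from collections import Counter

def repl(x, y):
    """Assumed replacement cost: normalized symmetric difference of two non-empty itemsets."""
    x, y = set(x), set(y)
    return len(x ^ y) / (len(x) + len(y))

def align(a, b):
    """Edit-distance alignment of two itemset sequences (insert/delete cost 1).
    Returns (cost, pairs), where pairs holds aligned indices (None marks a gap)."""
    n, m = len(a), len(b)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = float(i)
    for j in range(1, m + 1):
        d[0][j] = float(j)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(d[i - 1][j - 1] + repl(a[i - 1], b[j - 1]),
                          d[i - 1][j] + 1.0,
                          d[i][j - 1] + 1.0)
    # Trace back from the bottom-right cell to recover the alignment.
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + repl(a[i - 1], b[j - 1]):
            pairs.append((i - 1, j - 1)); i -= 1; j -= 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1.0:
            pairs.append((i - 1, None)); i -= 1
        else:
            pairs.append((None, j - 1)); j -= 1
    pairs.reverse()
    return d[n][m], pairs

def cluster(seqs, max_dist):
    """Step 1 (stand-in): a sequence joins the first cluster whose seed sequence
    is within max_dist (length-normalized distance); otherwise it starts a new one."""
    clusters = []
    for s in seqs:
        for c in clusters:
            cost, _ = align(c[0], s)
            if cost / max(len(c[0]), len(s)) <= max_dist:
                c.append(s)
                break
        else:
            clusters.append([s])
    return clusters

def weighted_sequence(cluster_seqs):
    """Step 2: fold the sequences of one cluster into a list of item counters,
    aligning each new sequence against the item sets seen so far."""
    ws = [Counter(itemset) for itemset in cluster_seqs[0]]
    for s in cluster_seqs[1:]:
        _, pairs = align([set(c) for c in ws], s)
        merged = []
        for i, j in pairs:
            c = Counter(ws[i]) if i is not None else Counter()
            if j is not None:
                c.update(s[j])
            merged.append(c)
        ws = merged
    return ws

def consensus(ws, n, min_strength):
    """Step 3 (assumed rule): keep items present in at least min_strength of the
    n sequences in the cluster, and drop positions that end up empty."""
    pattern = []
    for c in ws:
        items = sorted(x for x, k in c.items() if k / n >= min_strength)
        if items:
            pattern.append(items)
    return pattern

def mine(seqs, max_dist=0.5, min_strength=0.6):
    return [consensus(weighted_sequence(c), len(c), min_strength)
            for c in cluster(seqs, max_dist)]

seqs = [
    [{"a"}, {"b", "c"}, {"d"}],
    [{"a"}, {"b"}, {"d"}],
    [{"a"}, {"b", "c"}, {"e"}, {"d"}],   # contains a noisy extra itemset {"e"}
    [{"x"}, {"y"}],
    [{"x"}, {"y"}, {"z"}],
]
patterns = mine(seqs)
print(patterns)  # [[['a'], ['b', 'c'], ['d']], [['x'], ['y']]]
```

In this toy input the first three sequences form one cluster and the last two another. The noisy items `e` and `z` each appear in too few sequences of their cluster to pass the cutoff, so they are left out of the consensus patterns. A real implementation would need a more careful clustering step and alignment order than this greedy version.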