Visualizing Variable-Length Time Series Motifs


Yuan Li, Jessica Lin, Tim Oates
George Mason University; University of Maryland, Baltimore County

Abstract. The problem of time series motif discovery has received a lot of attention from researchers in the past decade. Most existing work on finding time series motifs requires that the length of the motifs be known in advance. However, such information is not always available. In addition, motifs of different lengths may co-exist in a time series dataset. In this work, we develop a motif visualization system based on grammar induction. We demonstrate that grammar induction in time series can effectively identify repeated patterns without prior knowledge of their lengths. The motifs discovered by the visualization system are variable-length in two ways. Not only can the inter-motif subsequences have variable lengths, but the intra-motif subsequences are also not restricted to identical lengths, a unique property that is desirable but has not been seen in the literature.

[pdf]


GrammarViz

We developed a grammar visualization system that allows users to navigate the grammar rules induced from the (discretized) time series. These rules are the motifs. Note that, unlike in existing motif discovery algorithms, our notion of a motif is not defined by a distance threshold. Rather, motifs are determined by their symbolic approximation and by the grammar rules.
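
To make the discretization step concrete, the following is a minimal sketch of one way to turn a numeric time series into a sequence of symbolic words. It assumes a SAX-style approximation (z-normalization, piecewise averaging, equiprobable Gaussian breakpoints) applied over a sliding window, and it drops consecutive duplicate words. The function names, the parameter values, and the duplicate-dropping step are illustrative assumptions, not a description of the system's actual code.

```python
from statistics import NormalDist, mean, pstdev


def sax_word(window, n_segments=4, alphabet="abcd"):
    """Map one window of numbers to a short symbolic word (SAX-style sketch)."""
    mu, sigma = mean(window), pstdev(window)
    # z-normalize; a perfectly flat window becomes all zeros
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in window]
    # piecewise aggregate approximation: mean of each of n_segments chunks
    # (requires len(window) >= n_segments so that no chunk is empty)
    n = len(z)
    paa = [mean(z[i * n // n_segments:(i + 1) * n // n_segments])
           for i in range(n_segments)]
    # breakpoints that cut a standard normal into equiprobable regions
    k = len(alphabet)
    cuts = [NormalDist().inv_cdf(j / k) for j in range(1, k)]
    # the symbol index is the number of breakpoints lying below the value
    return "".join(alphabet[sum(v > c for c in cuts)] for v in paa)


def discretize(series, window=8, n_segments=4, alphabet="abcd"):
    """Slide a window over the series and return (start_index, word) pairs."""
    words = []
    for start in range(len(series) - window + 1):
        w = sax_word(series[start:start + window], n_segments, alphabet)
        # assumption: skip a word identical to the previous one, so a run of
        # similar windows is represented by a single token
        if not words or words[-1][1] != w:
            words.append((start, w))
    return words
```

Because each kept word records its start index, any pattern later found in the word sequence can be mapped back to a span of the original series. When consecutive duplicates are dropped, two occurrences of the same word pattern can cover spans of different lengths.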

While there are many algorithms for learning grammars from tokenized data, one particularly promising algorithm is Sequitur, a linear-time string compression and grammar induction algorithm. In this work, we used Sequitur to demonstrate the utility of time series grammar induction, in particular for variable-length motif discovery.
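
The idea of reading repeated patterns off a grammar can be illustrated with a small sketch. The code below is not Sequitur: it is a simpler offline, Re-Pair-style procedure that repeatedly replaces the most frequent adjacent pair of tokens with a new rule. It is neither incremental nor linear-time, and it is shown only because it fits in a few lines. The rule names (R1, R2, ...) and function names are hypothetical.

```python
from collections import Counter


def induce_grammar(tokens):
    """Re-Pair-style digram replacement (a stand-in sketch, NOT Sequitur).

    Returns the compressed top-level sequence and a dict of rules.
    Assumes no input token is itself named like a rule ("R1", "R2", ...).
    """
    seq = list(tokens)
    rules = {}
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        # stop once no adjacent pair is seen at least twice
        # (overlapping occurrences, as in "a a a", are counted separately)
        if count < 2:
            break
        name = f"R{len(rules) + 1}"
        rules[name] = pair
        # rewrite the sequence left to right, replacing non-overlapping matches
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(name)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules


def expand(symbol, rules):
    """Recursively expand a rule name back into the original tokens."""
    if symbol not in rules:
        return [symbol]
    return [t for s in rules[symbol] for t in expand(s, rules)]
```

For example, on the token sequence `ab bc ca ab bc ca dd` this sketch produces the top-level sequence `R2 R2 dd`, where `R2` expands to `ab bc ca`. Each occurrence of such a rule corresponds to one candidate motif instance, and rules of different sizes correspond to motifs of different lengths.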


Screenshot of the grammar/motif visualization system on the winding dataset. The motif found is approximately 140 in length.


Screenshot of the grammar/motif visualization system on the winding dataset. The motif length is about 370.


Screenshot of the grammar/motif visualization system on the winding dataset, without sliding window option. The motif length is about 470.


Screenshot of the grammar/motif visualization system on the insect dataset. The motif length is about 750.


Screenshot of the grammar/motif visualization system on the insect dataset. The motif length is between 470 and 490.