
Commit a04eb84: Update 2_dtw.md (parent 19f941a)


docs/2_method/2_dtw.md (11 additions, 12 deletions)

Dynamic time warping is a well-known technique for manipulating time series to e…

## DTW Algorithm

Consider a time series to be a vector of some arbitrary length. Consider that we have $$p$$ such vectors in total, each possibly differing in length. To find a subset of $$k$$ clusters within the set of $$p$$ vectors using a MIP formulation, we must first make $$\frac{1}{2}p(p-1)$$ pairwise comparisons between all vectors within the total set and find the 'similarity' between each pair. In this case, the similarity is defined as the DTW distance. Consider two time series $$x$$ and $$y$$ of differing lengths $$n$$ and $$m$$ respectively,

$$
\begin{aligned}
x&=(x_1, x_2, ..., x_n)\\
y&=(y_1, y_2, ..., y_m).
\end{aligned}
$$

The DTW distance is the sum of the Euclidean distances between each point and its matched point(s) in the other vector. The following constraints must be met:

1. The first and last elements of each series must be matched.
2. Only unidirectional forward movement through relative time is allowed, i.e., if $$x_1$$ is mapped to $$y_2$$ then $$x_2$$ may not be mapped to $$y_1$$ (monotonicity).
3. Each point is mapped to at least one other point, i.e., there are no jumps in time (continuity).

![Two time series with DTW pairwise alignment between each element, showing one-to-many mapping properties of DTW (left). Cost matrix $$C$$ for the two time series, showing the warping path and final DTW cost at $$C_{14,13}$$ (right).](../../media/Merged_document.png)

Finding the optimal warping arrangement is an optimisation problem that can be solved using dynamic programming, which splits the problem into easier sub-problems and solves them recursively, storing intermediate solutions until the final solution is reached. To understand the memory-efficient method used in `DTW-C++`, it is useful to first examine the full-cost matrix solution, as follows. For each pairwise comparison, an $$n$$ by $$m$$ matrix $$C^{n\times m}$$ is calculated, where each element represents the cumulative cost between series up to the points $$x_i$$ and $$y_j$$:

$$
c_{i,j} = (x_i-y_j)^2 + \min\left\{
\begin{matrix}
c_{i-1,j-1} & c_{i-1,j} & c_{i,j-1}
\end{matrix}
\right\}
$$

The final element $$c_{n,m}$$ is then the total cost, $$C_{x,y}$$, which provides the comparison metric between the two series $$x$$ and $$y$$. The figure above shows an example of this cost matrix $$C$$ and the warping path through it.
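
To make the recursion concrete, a minimal C++ sketch of the full-cost-matrix calculation is given below. It is an illustration under assumptions (a `std::vector<double>` series representation and a hypothetical `dtwFullMatrix` function), not the actual `DTW-C++` implementation.

```cpp
#include <algorithm>
#include <limits>
#include <vector>

// Sketch: fill the full (n+1) x (m+1) cumulative cost matrix and return the
// final element, i.e. the DTW distance between series x and y.
double dtwFullMatrix(const std::vector<double>& x, const std::vector<double>& y)
{
    const std::size_t n = x.size();
    const std::size_t m = y.size();
    const double inf = std::numeric_limits<double>::infinity();

    // C[i][j] holds the cumulative cost up to points x_i and y_j; row 0 and
    // column 0 act as boundary conditions so that c_{1,1} = (x_1 - y_1)^2.
    std::vector<std::vector<double>> C(n + 1, std::vector<double>(m + 1, inf));
    C[0][0] = 0.0;

    for (std::size_t i = 1; i <= n; ++i) {
        for (std::size_t j = 1; j <= m; ++j) {
            const double d = (x[i - 1] - y[j - 1]) * (x[i - 1] - y[j - 1]);
            // Cheapest of the three allowed predecessors, as in the recursion above.
            C[i][j] = d + std::min({C[i - 1][j - 1], C[i - 1][j], C[i][j - 1]});
        }
    }
    return C[n][m]; // total cost C_{x,y}
}
```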

For the clustering problem, only this final cost for each pairwise comparison is required; the actual warping path (or mapping of each point in one time series to the other) is superfluous for k-medoids clustering. The memory complexity of the cost matrix $$C$$ is $$O(nm)$$, so as the length of the time series increases, the memory required increases greatly. Therefore, significant reductions in memory can be made by not storing the entire $$C$$ matrix. When the warping path is not required, only a vector containing the previous row for the current step of the dynamic programming sub-problem is needed (i.e., the previous three values $$c_{i-1,j-1}$$, $$c_{i-1,j}$$, $$c_{i,j-1}$$).
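
The row-reuse idea can be sketched as follows; again this is an assumed illustration (hypothetical `dtwTwoRows` function) rather than the exact `DTW-C++` code. Only two rows of the cost matrix are stored, so memory falls from $$O(nm)$$ to $$O(m)$$ while the returned cost $$c_{n,m}$$ is unchanged.

```cpp
#include <algorithm>
#include <limits>
#include <vector>

// Sketch: reduced-memory DTW cost. The warping path cannot be recovered,
// but the final cost c_{n,m} matches the full-matrix version.
double dtwTwoRows(const std::vector<double>& x, const std::vector<double>& y)
{
    const std::size_t n = x.size();
    const std::size_t m = y.size();
    const double inf = std::numeric_limits<double>::infinity();

    std::vector<double> prev(m + 1, inf); // row i-1 of the cost matrix
    std::vector<double> curr(m + 1, inf); // row i being filled
    prev[0] = 0.0;

    for (std::size_t i = 1; i <= n; ++i) {
        curr[0] = inf; // boundary: column 0 is unreachable for i >= 1
        for (std::size_t j = 1; j <= m; ++j) {
            const double d = (x[i - 1] - y[j - 1]) * (x[i - 1] - y[j - 1]);
            // Only c_{i-1,j-1}, c_{i-1,j} and c_{i,j-1} are ever needed.
            curr[j] = d + std::min({prev[j - 1], prev[j], curr[j - 1]});
        }
        std::swap(prev, curr); // the current row becomes the previous row
    }
    return prev[m]; // after the final swap, prev holds row n
}
```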

The DTW distance $$C_{x,y}$$ is found for each pairwise comparison. Pairwise distances are then stored in a separate symmetric matrix, $$D^{p\times p}$$, where $$p$$ is the total number of time series in the clustering exercise. In other words, the element $$d_{i,j}$$ gives the distance between time series $$i$$ and $$j$$.
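
As an illustration of how the matrix $$D$$ could be assembled, the sketch below reuses the hypothetical `dtwTwoRows` function from above and evaluates each of the $$\frac{1}{2}p(p-1)$$ pairs once, mirroring the result because the DTW distance is symmetric. The function name `buildDistanceMatrix` is likewise an assumption, not part of the `DTW-C++` API.

```cpp
#include <vector>

// Sketch: symmetric p x p distance matrix D, where d_{i,j} is the DTW
// distance between time series i and j (0-based indices here).
std::vector<std::vector<double>>
buildDistanceMatrix(const std::vector<std::vector<double>>& series)
{
    const std::size_t p = series.size();
    std::vector<std::vector<double>> D(p, std::vector<double>(p, 0.0));

    for (std::size_t i = 0; i < p; ++i) {
        for (std::size_t j = i + 1; j < p; ++j) {
            const double cost = dtwTwoRows(series[i], series[j]); // assumed helper above
            D[i][j] = cost;
            D[j][i] = cost; // symmetry: d_{j,i} = d_{i,j}
        }
    }
    return D;
}
```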

### Warping Window
