Why is the discrepancy between two identical time-series negative? #10
Comments
Like DTW, SDTW is not a distance. As you noticed, unlike DTW it is not zero for identical inputs: SDTW considers all possible alignments, weighted by their probability under the Gibbs distribution (see the paper), while DTW considers only the best one.
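Intuitively, soft-DTW replaces the hard minimum over alignment costs with a soft minimum, `min^γ(a_1, ..., a_n) = -γ log Σ_i exp(-a_i / γ)`, which sits strictly below the true minimum whenever more than one alignment contributes. Since the best alignment of a series with itself costs 0, the soft minimum dips below 0. A tiny numeric sketch (the alignment costs here are made up for illustration):

```python
import numpy as np

def softmin(costs, gamma):
    """Soft minimum used by soft-DTW: -gamma * log(sum(exp(-costs / gamma)))."""
    z = -np.asarray(costs, dtype=float) / gamma
    m = np.max(z)
    return -gamma * (m + np.log(np.sum(np.exp(z - m))))  # stable log-sum-exp

# Hypothetical alignment costs for x compared with itself: the diagonal
# alignment costs 0, all other alignments cost more.
costs = [0.0, 0.3, 0.7, 1.2]

print(min(costs))            # hard DTW keeps only the best alignment: 0.0
print(softmin(costs, 1.0))   # soft-DTW blends all alignments: below 0
print(softmin(costs, 0.01))  # as gamma -> 0, it approaches the hard minimum
```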
As is typically done for the GA kernel (http://marcocuturi.net/GA.html), one might want to normalize SDTW to give it that property.
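Concretely, a GA-kernel-style normalization subtracts the average of the two self-comparisons, so the divergence of a series with itself becomes exactly zero. A sketch, assuming the `SoftDTW` / `SquaredEuclidean` interface from this package's README (the helper names are mine):

```python
from sdtw import SoftDTW
from sdtw.distance import SquaredEuclidean

def sdtw_value(X, Y, gamma=1.0):
    """Soft-DTW discrepancy between two series, each an array of shape [length, dim]."""
    return SoftDTW(SquaredEuclidean(X, Y), gamma=gamma).compute()

def sdtw_normalized(X, Y, gamma=1.0):
    """GA-kernel-style normalization: sdtw(X, Y) - (sdtw(X, X) + sdtw(Y, Y)) / 2."""
    return sdtw_value(X, Y, gamma) - 0.5 * (sdtw_value(X, X, gamma) + sdtw_value(Y, Y, gamma))
```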
@mblondel thank you for your insights. I looked at your paper and see that the discrepancy is the weighted sum over all possible alignments, which I believe is Equation 1 in your paper. From the cosine example above, I observed that the discrepancy between a cosine and itself is greater than the discrepancy between that cosine and a slightly shifted version of itself scaled by 0.5. Does this imply that alignments that account for scale are penalized more heavily than alignments that account for shifts? I believe I will self-normalize each of my time-series to its mean to help mitigate this bias. As @mblondel states in his paper, "Unlike the Euclidean distance, DTW can compare time series of variable size and is robust to shifts or dilatations across the time dimension." Perhaps DTW is not robust to scaling and amplitude changes of a time-series. Am I reading this all correctly?

Also, thanks for pointing out your normalization, @marcocuturi. Can the normalization be performed before computing the SDTW discrepancy? Can I use the same equation to normalize the gradient of the alignment matrix if I want to use the normalized discrepancy as a fitting term?

FWIW, I applied @marcocuturi's suggestion to the cosines defined above. The results now resemble a distance metric:
Here's the output:
Furthermore, the effect of scaling is more plainly seen after this normalization. If I define the Z cosine as
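For illustration, a self-contained stand-in for that normalized comparison; the cosines below, including the Z definition, are guesses rather than the original signals:

```python
import numpy as np
from sdtw import SoftDTW
from sdtw.distance import SquaredEuclidean

def sdtw_value(X, Y, gamma=1.0):
    return SoftDTW(SquaredEuclidean(X, Y), gamma=gamma).compute()

def sdtw_normalized(X, Y, gamma=1.0):
    # Subtract the self-comparisons so identical series score exactly 0.
    return sdtw_value(X, Y, gamma) - 0.5 * (sdtw_value(X, X, gamma) + sdtw_value(Y, Y, gamma))

t = np.linspace(0, 4 * np.pi, 100)
X = np.cos(t).reshape(-1, 1)              # reference cosine
Y = 0.5 * np.cos(t + 0.5).reshape(-1, 1)  # shifted, scaled by 0.5
Z = 2.0 * np.cos(t + 0.5).reshape(-1, 1)  # shifted, scaled by 2.0

print(sdtw_normalized(X, X))  # exactly 0 for identical series
print(sdtw_normalized(X, Y))  # moderate: amplitude halved
print(sdtw_normalized(X, Z))  # larger: doubled amplitude dominates the cost
```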
I've been trying to cluster time-series data using soft-DTW and hierarchical clustering, but I have had a difficult time recovering clusters that are obvious by eye. @marcocuturi's normalization equation has helped, except that a dissimilarity metric seems to confound my clustering efforts, since the very dissimilar scores appear to dominate the clustering. To that end, I rescaled the normalized dissimilarity metric into (0, 1] with the equation
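The equation itself did not come through above, so the mapping below is only a hypothetical example with the same (0, 1] range: `exp(-d)` sends a non-negative dissimilarity to 1 for identical series and towards 0 as the dissimilarity grows (note this turns it into a similarity; a routine expecting distances would want `1 - exp(-d)` instead).

```python
import numpy as np

def to_unit_interval(d):
    """Hypothetical rescaling: non-negative dissimilarity d -> (0, 1], with 1 when d == 0."""
    return np.exp(-np.asarray(d, dtype=float))

# Example: S = to_unit_interval(D), where D is a matrix of normalized soft-DTW values.
```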
I am curious: do you observe the same behavior with unregularized / exact DTW? Hierarchical clustering doesn't need gradients, so you could try that too. SDTW really shines when used for k-means clustering, since the centroids / barycenters can be learned; I haven't added k-means to the repository yet, though. Another option is to use soft-DTW as a kernel and plug it into kernel k-means. The kernel is given by
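To make the hierarchical-clustering route concrete, here is a hedged sketch: build a pairwise matrix of normalized soft-DTW values, hand it to SciPy's hierarchy module as a precomputed dissimilarity, and optionally exponentiate it into a kernel matrix for a kernel k-means implementation. The `exp(-d)` kernel form is my assumption, since the formula in the comment above was cut off; the toy cosines are likewise illustrative.

```python
import numpy as np
from scipy.spatial.distance import squareform
from scipy.cluster.hierarchy import linkage, fcluster
from sdtw import SoftDTW
from sdtw.distance import SquaredEuclidean

def sdtw_value(X, Y, gamma=1.0):
    return SoftDTW(SquaredEuclidean(X, Y), gamma=gamma).compute()

def sdtw_normalized(X, Y, gamma=1.0):
    return sdtw_value(X, Y, gamma) - 0.5 * (sdtw_value(X, X, gamma) + sdtw_value(Y, Y, gamma))

# Toy data set: a few shifted / scaled cosines, each of shape [length, 1].
t = np.linspace(0, 4 * np.pi, 100)
series = [a * np.cos(t + s).reshape(-1, 1)
          for a, s in [(1.0, 0.0), (1.0, 0.3), (0.5, 0.0), (2.0, 0.3)]]

# Pairwise normalized soft-DTW dissimilarities (no gradients needed here).
n = len(series)
D = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        D[i, j] = D[j, i] = sdtw_normalized(series[i], series[j])

# Agglomerative clustering on the precomputed dissimilarity matrix.
link = linkage(squareform(D, checks=False), method="average")
labels = fcluster(link, 2, criterion="maxclust")

# A kernel-style similarity matrix (assumed exp(-d) form) for kernel k-means.
K = np.exp(-D)
```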
I believe this project supports SDTW-based k-means clustering: |
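For reference, one library that exposes soft-DTW k-means out of the box is tslearn; whether that is the project meant above is my guess, and the data below is a toy example:

```python
import numpy as np
from tslearn.clustering import TimeSeriesKMeans

# X: array of shape (n_series, length, dim) holding toy cosines.
t = np.linspace(0, 4 * np.pi, 100)
X = np.stack([a * np.cos(t + s).reshape(-1, 1)
              for a, s in [(1.0, 0.0), (1.0, 0.3), (2.0, 0.0), (2.0, 0.3)]])

km = TimeSeriesKMeans(n_clusters=2, metric="softdtw", random_state=0)
labels = km.fit_predict(X)       # cluster assignments
centroids = km.cluster_centers_  # learned soft-DTW barycenters
```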
Hello,
I wanted to build up some intuition about how the soft-DTW cost function works by playing around with some fake data. I started comparing scaled and shifted cosines to see what numbers soft-DTW would produce. However, when I compared a cosine to itself I couldn't understand why I was getting a negative result. Wouldn't the cumulative alignment cost for a signal compared to itself be zero? Could someone please help me understand why the result is negative?
Also, when I compared similarly shifted cosines, where one was scaled by 0.5 and the other by 2.0, the discrepancies were very different from each other. Why does scaling have such a strong effect on the scoring? Should I be self-normalizing my data before using soft-DTW?
I tried the following:
I got the following output:
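For readers following along, a minimal stand-in for this kind of experiment, using the sdtw package's README interface; the exact signals and parameters are guesses, not the original code:

```python
import numpy as np
from sdtw import SoftDTW
from sdtw.distance import SquaredEuclidean

def sdtw_value(X, Y, gamma=1.0):
    """Soft-DTW discrepancy; each series is an array of shape [length, dim]."""
    return SoftDTW(SquaredEuclidean(X, Y), gamma=gamma).compute()

t = np.linspace(0, 4 * np.pi, 100)
X = np.cos(t).reshape(-1, 1)              # reference cosine
Y = 0.5 * np.cos(t + 0.5).reshape(-1, 1)  # shifted, scaled by 0.5
Z = 2.0 * np.cos(t + 0.5).reshape(-1, 1)  # shifted, scaled by 2.0

print(sdtw_value(X, X))  # negative, not zero: soft-min over all alignments
print(sdtw_value(X, Y))
print(sdtw_value(X, Z))  # amplitude differences are penalized heavily
```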