Could it be that in the Mackey-Glass experiment, the network is asked to approximate a 15-step delay instead of simulating a complex system?
The definition of the X and Y data is as follows:
Y = X[:, :-predict_length, :]
X = X[:, predict_length:, :]
This implies the X data starts at timestep predict_length (15) and ends at the end of the series, while Y starts at t=0 and runs until 15 steps before the end of the series. From what I understand, this means the network has already seen every value it is asked to predict. Is this intentional?
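For reference, here is a minimal sketch contrasting the two slicings, assuming data of shape (batch, timesteps, features); the variable names are illustrative. Trimming the opposite ends of the series for X and Y is presumably what a genuine 15-step-ahead prediction task would look like:

```python
import numpy as np

predict_length = 15
data = np.random.rand(32, 5000, 1)  # (batch, timesteps, features)

# As currently written (a delay task): the target at step t is the
# input from 15 steps *earlier*, i.e. Y[t] == X[t - 15].
X_delay = data[:, predict_length:, :]
Y_delay = data[:, :-predict_length, :]

# Presumed intent (a prediction task): the target at step t is the
# input from 15 steps *later*, i.e. Y[t] == X[t + 15].
X_pred = data[:, :-predict_length, :]
Y_pred = data[:, predict_length:, :]
```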
Hi @creativedutchmen. Thanks for reviewing the code and bringing this to our attention. I think what happened is that, at one point, I was testing the network's ability to compute a delay line and forgot to change it back. It's a shame that this wasn't noticed until now, especially since last night was the cutoff for making final changes to the paper.
Fortunately this doesn't affect the nature of the results. This isn't too surprising, since the future state of a strange attractor can be predicted from a delay embedding (Takens, 1981); in other words, the history of the time-series (i.e., its delays) is useful for predicting its future. In fact, the MG dataset is generated by a delay differential equation. This reinforces the paper's point that it's difficult for LSTMs to learn pure delays.
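For concreteness, here is a minimal sketch of generating such a series by Euler-integrating the Mackey-Glass delay differential equation, dx/dt = β·x(t−τ) / (1 + x(t−τ)^n) − γ·x(t), using the commonly cited parameter values (the exact generation settings used for the paper may differ):

```python
import numpy as np

def mackey_glass(n_steps, tau=17.0, beta=0.2, gamma=0.1, n=10, dt=1.0, x0=1.2):
    """Euler integration of dx/dt = beta*x(t-tau)/(1 + x(t-tau)**n) - gamma*x(t)."""
    delay = int(round(tau / dt))
    x = np.full(n_steps + delay, x0)  # constant history for t <= 0
    for t in range(delay, n_steps + delay - 1):
        x_tau = x[t - delay]  # the delayed state x(t - tau)
        x[t + 1] = x[t] + dt * (beta * x_tau / (1.0 + x_tau**n) - gamma * x[t])
    return x[delay:]

series = mackey_glass(5000)  # chaotic regime for tau = 17
```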
The default initializers for the LMUCell should be taken with a grain of salt. Using the uniform initializer instead starts the network off with a much more reasonable error for this task. We also found that a trainable bias vector is needed in the hidden layer for this task.
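As a rough illustration, overriding the defaults might look something like the sketch below. Note that the import path and the LMUCell argument names here are assumptions for illustration only and should be checked against the actual cell's signature:

```python
import tensorflow as tf
from lmu import LMUCell  # hypothetical import path

# Argument names are hypothetical -- check the actual LMUCell signature.
cell = LMUCell(
    units=49,
    order=4,
    theta=4,
    kernel_initializer=tf.keras.initializers.RandomUniform(minval=-0.05, maxval=0.05),
    use_bias=True,  # trainable bias in the hidden layer
)
```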
Let me know if you have any thoughts. Thanks again. I am currently running a couple more experiments and discussing with my coauthors how best to address this mistake and move forward with publishing a correction.