In your dataset, every sample may have a different sequence length. Yes, as you said, RNN cells accept variable sequence lengths. But for some other models, like attention-based ones, the sizes of the attention matrices are fixed. Therefore, to accomplish this, you should find the max length among your samples and pad the shorter ones to that same length (i.e. the max one). You can also add a mask array, called padding_mask, to indicate which parts of each sample are padded.
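A minimal sketch of the padding idea above, using a hypothetical toy batch (the sample lengths and feature size are made up for illustration):

```python
import torch

# Toy batch: three samples with different sequence lengths, 2 features each.
samples = [torch.randn(5, 2), torch.randn(3, 2), torch.randn(4, 2)]

max_len = max(s.shape[0] for s in samples)  # the max length, here 5

padded, padding_mask = [], []
for s in samples:
    pad_len = max_len - s.shape[0]
    # pad the shorter samples with zeros up to max_len
    padded.append(torch.cat([s, torch.zeros(pad_len, s.shape[1])], dim=0))
    # 1 marks real time steps, 0 marks padded ones
    padding_mask.append(torch.cat([torch.ones(s.shape[0]), torch.zeros(pad_len)]))

padded = torch.stack(padded)              # shape (3, 5, 2)
padding_mask = torch.stack(padding_mask)  # shape (3, 5)
```

The padding_mask can then be passed alongside the data so a model knows which positions to ignore.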
I did give some thought to this feature request. In my view, we cannot yet come up with a general method or function that works for all models, because this feature is very model-specific and we would have to write a specific implementation for each model (or at least for each kind of model). For example, for self-attention models like SAITS, we can add attention masks over the padded parts to enable training on variable-length input; for RNN models like BRITS, we can leverage torch.nn.utils.rnn.pack_padded_sequence() and torch.nn.utils.rnn.pad_packed_sequence() to help with it.
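For the RNN case, a small sketch of the pack/unpack workflow mentioned above (the batch shapes, true lengths, and the GRU are illustrative assumptions, not BRITS itself):

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

# Assume a padded batch (batch_first): 3 samples padded to length 5,
# 2 features, with true lengths 5, 3, and 4.
padded = torch.randn(3, 5, 2)
lengths = torch.tensor([5, 3, 4])

rnn = torch.nn.GRU(input_size=2, hidden_size=8, batch_first=True)

# Pack so the RNN skips the padded steps entirely
packed = pack_padded_sequence(padded, lengths, batch_first=True,
                              enforce_sorted=False)
packed_out, _ = rnn(packed)

# Unpack back into a padded tensor of shape (3, 5, 8)
out, out_lengths = pad_packed_sequence(packed_out, batch_first=True)
```

Note enforce_sorted=False lets the batch stay in its original order; outputs past each sample's true length are left zero-padded.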
Therefore, the workload will be quite large. We can start with the models we're familiar with and handle them one by one.
1. Feature description
Enable variable sequence lengths in the input data.
2. Motivation
Some of the input training data are composed of multiple concatenated time series of different lengths.
3. Your contribution
I will try to help.