
[Question] How to pass a Pytorch module to the constructor of the RNNModel #2082

Closed
ChristophKarlHeck opened this issue Nov 22, 2023 · 1 comment · Fixed by #2088
Labels: bug (Something isn't working)

ChristophKarlHeck commented Nov 22, 2023

Hi guys,
According to the documentation (https://unit8co.github.io/darts/generated_api/darts.models.forecasting.rnn_model.html), it should be possible to pass a PyTorch module with the same specifications as darts.models.rnn_model._RNNModule to the constructor of the RNNModel.
I created such a custom module, but I always got the following error:

CustomRNNModule.forward() got an unexpected keyword argument 'name'

I did some research and found that in https://github.com/unit8co/darts/blob/master/darts/models/forecasting/rnn_model.py, in lines 481 to 490:

model = self.rnn_type_or_module(
    name="custom_module",
    input_size=input_dim,
    target_size=output_dim,
    nr_params=nr_params,
    hidden_dim=self.hidden_dim,
    dropout=self.dropout,
    num_layers=self.n_rnn_layers,
    **self.pl_module_params,
)

the module that was already passed in gets "instantiated" a second time. This cannot work: at this point self.rnn_type_or_module is already an nn.Module instance, so calling it dispatches to its forward() method with the keyword arguments above, which raises the error. So either the instance should not be constructed again here, or the check in lines 444 to 452:

            raise_if_not(
                isinstance(model, nn.Module),
                '{} is not a valid RNN model.\n Please specify "RNN", "LSTM", '
                '"GRU", or give your own PyTorch nn.Module'.format(
                    model.__class__.__name__
                ),
                logger,
            )

is wrong, since it accepts a module instance that the instantiation code above cannot handle.
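To illustrate the mechanism, here is a minimal standalone sketch (a hypothetical Toy module, plain PyTorch, no darts involved) showing why calling an already-constructed nn.Module with these keyword arguments produces exactly this error:

import torch


class Toy(torch.nn.Module):
    def forward(self, x):
        return x


toy = Toy()                # an instance, like the custom module passed to RNNModel
toy(name="custom_module")  # nn.Module.__call__ passes the kwargs on to forward()
# TypeError: Toy.forward() got an unexpected keyword argument 'name'

Calling the class constructs a new module; calling the instance runs forward(). The latter is what effectively happens in the darts snippet above when a module instance was passed in.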
Here is my PyTorch module with the same specifications as darts.models.rnn_model._RNNModule:

from typing import Optional, Tuple

import torch

from darts.logging import get_logger
from darts.models.forecasting.pl_forecasting_module import (
    PLDualCovariatesModule,
    io_processor,
)


logger = get_logger(__name__)


class CustomRNNModule(PLDualCovariatesModule):
    """
    Custom LSTM Module
    """

    def __init__(
        self,
        # The name of the specific PyTorch RNN module ("RNN", "GRU" or "LSTM").
        name: str,
        # The dimensionality of the input time series.
        input_size: int,
        # The number of features in the hidden state `h` of the RNN module.
        hidden_dim: int,
        # The number of recurrent layers.
        num_layers: int,
        # The dimensionality of the output time series.
        target_size: int,
        # The number of parameters of the likelihood (or 1 if no likelihood is used).
        nr_params: int,
        # The fraction of neurons that are dropped in all-but-last RNN layers.
        dropout: float = 0.0,
        # All parameters required for the :class:`darts.model.forecasting_models.PLForecastingModule` base class.
        **kwargs,
    ):
        # The RNN module doesn't really need input_chunk_length and
        # output_chunk_length itself, but the PLModule base class does.
        super().__init__(**kwargs)

        # Defining parameters
        self.target_size = target_size
        self.nr_params = nr_params
        self.name = name

        # lstm1, lstm2 and linear are the layers in the network; batch_first=True
        # makes the LSTMs accept (batch_size, input_length, features) tensors,
        # matching the shape comments in forward() below
        self.lstm1 = torch.nn.LSTM(
            input_size, hidden_dim, num_layers, batch_first=True, dropout=dropout
        )
        self.lstm2 = torch.nn.LSTM(
            hidden_dim, hidden_dim, num_layers, batch_first=True, dropout=dropout
        )
        self.linear = torch.nn.Linear(hidden_dim, target_size * nr_params)

    @io_processor
    def forward(self, x_in: Tuple, h: Optional[torch.Tensor] = None
                ) -> Tuple[torch.Tensor, torch.Tensor]:
        # x_in: Tuple whose first element has shape (batch_size, input_length, input_size)
        # batch_size = number of series per batch (1 in the training script below)
        # input_length = number of time steps
        # input_size = the number of expected features in the input x
        x, _ = x_in

        # data is of size (batch_size, input_length, input_size)
        batch_size = x.shape[0]

        # out is of size (batch_size, input_length, hidden_dim)
        # LSTM Layers
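        # Note: only lstm2's state ends up in last_hidden_state below, and the
        # same incoming state h is fed to both layers on later prediction steps.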
        out, last_hidden_state = self.lstm1(x) if h is None else self.lstm1(x, h)
        out, last_hidden_state = self.lstm2(out) if h is None else self.lstm2(out, h)

        # Here, we apply the V matrix to every hidden state to produce the outputs
        predictions = self.linear(out)

        # predictions is of size (batch_size, input_length, target_size)
        predictions = predictions.view(batch_size, -1, self.target_size, self.nr_params)

        # returns outputs for all inputs, only the last one is needed for prediction time
        return predictions, last_hidden_state

    

    def _produce_train_output(self, input_batch: Tuple) -> torch.Tensor:
        (
            past_target,
            historic_future_covariates,
            future_covariates,
            static_covariates,
        ) = input_batch
        # For the RNN we concatenate the past_target with the future_covariates
        # (they have the same length because we enforce a Shift dataset for RNNs)
        model_input = (
            torch.cat([past_target, future_covariates], dim=2)
            if future_covariates is not None
            else past_target,
            static_covariates,
        )
        return self(model_input)[0]
    
    def _produce_predict_output(
        self, x: Tuple, last_hidden_state: Optional[torch.Tensor] = None
    ) -> Tuple[torch.Tensor, torch.Tensor]:
        """overwrite parent classes `_produce_predict_output` method"""
        output, hidden = self(x, last_hidden_state)
        if self.likelihood:
            if self.predict_likelihood_parameters:
                return self.likelihood.predict_likelihood_parameters(output), hidden
            else:
                return self.likelihood.sample(output), hidden
        else:
            return output.squeeze(dim=-1), hidden
        
    def _get_batch_prediction(
        self, n: int, input_batch: Tuple, roll_size: int
    ) -> torch.Tensor:
        """
        This model is recurrent, so we have to write a specific way to
        obtain the time series forecasts of length n.
        """
        (
            past_target,
            historic_future_covariates,
            future_covariates,
            static_covariates,
        ) = input_batch

        if historic_future_covariates is not None:
            # RNNs need as inputs (target[t] and covariates[t+1]) so here we shift the covariates
            all_covariates = torch.cat(
                [historic_future_covariates[:, 1:, :], future_covariates], dim=1
            )
            cov_past, cov_future = (
                all_covariates[:, : past_target.shape[1], :],
                all_covariates[:, past_target.shape[1] :, :],
            )
            input_series = torch.cat([past_target, cov_past], dim=2)
        else:
            input_series = past_target
            cov_future = None

        batch_prediction = []
        out, last_hidden_state = self._produce_predict_output(
            (input_series, static_covariates)
        )
        batch_prediction.append(out[:, -1:, :])
        prediction_length = 1

        while prediction_length < n:

            # create new input to model from last prediction and current covariates, if available
            new_input = (
                torch.cat(
                    [
                        out[:, -1:, :],
                        cov_future[:, prediction_length - 1 : prediction_length, :],
                    ],
                    dim=2,
                )
                if cov_future is not None
                else out[:, -1:, :]
            )

            # feed new input to model, including the last hidden state from the previous iteration
            out, last_hidden_state = self._produce_predict_output(
                (new_input, static_covariates), last_hidden_state
            )

            # append prediction to batch prediction array, increase counter
            batch_prediction.append(out[:, -1:, :])
            prediction_length += 1

        # bring predictions into desired format and drop unnecessary values
        batch_prediction = torch.cat(batch_prediction, dim=1)
        batch_prediction = batch_prediction[:, :n, :]
        return batch_prediction
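As a quick sanity check that the architecture itself is sound, the following hypothetical snippet (plain PyTorch, independent of darts) mirrors the two LSTM layers and the linear head and verifies the output shapes that forward() is expected to produce:

import torch

# dimensions chosen arbitrarily for the check
batch_size, input_length = 4, 12
input_size, hidden_dim, num_layers = 1, 20, 2
target_size, nr_params = 1, 1

lstm1 = torch.nn.LSTM(input_size, hidden_dim, num_layers, batch_first=True)
lstm2 = torch.nn.LSTM(hidden_dim, hidden_dim, num_layers, batch_first=True)
linear = torch.nn.Linear(hidden_dim, target_size * nr_params)

x = torch.randn(batch_size, input_length, input_size)
out, _ = lstm1(x)
out, last_hidden_state = lstm2(out)
predictions = linear(out).view(batch_size, -1, target_size, nr_params)

assert predictions.shape == (batch_size, input_length, target_size, nr_params)

This runs without error, so the failure really comes from how RNNModel wires the passed instance, not from the module itself.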

Creating an instance of the custom module and passing it to the RNNModel constructor:

import torch

from darts.timeseries import TimeSeries
from darts.models import RNNModel
from pytorch_lightning.callbacks.early_stopping import EarlyStopping
from custom_modules.custom_rnn_module import CustomRNNModule


def train_custom_lstm(
    train: list[TimeSeries],
    test: list[TimeSeries],
    input_chunk_length: int,
    output_chunk_length: int,
    model_name="custom_lstm",
) -> bool:
    """
    Custom LSTM Training Method
    """
    ###---SET MODEL PARAMETERS---###

    earlystoppercb = EarlyStopping(
        monitor="val_loss", patience=10, min_delta=0.05, mode="min"
    )

    pl_trainer_kwargs = {
        "accelerator": "gpu",  # run on GPU
        "devices": -1,  # use all available GPUs
        "callbacks": [earlystoppercb],  # include callbacks
    }

    save_path = "models/custom_lstm_trained.pt"

    custom_lstm_module = CustomRNNModule(
        name="custom_lstm_module",
        input_size=1,  # the dimensionality of the input time series
        hidden_dim=20,  # the number of features in the hidden state `h` of the RNN module
        num_layers=2,  # the number of recurrent layers
        target_size=1,  # the dimensionality of the output time series
        nr_params=1,  # the number of parameters of the likelihood (or 1 if no likelihood is used)
        dropout=0.0,  # the fraction of neurons that are dropped in all-but-last RNN layers
        input_chunk_length=input_chunk_length,
        output_chunk_length=output_chunk_length,
    )

    ###---CREATE MODEL---###
    model = RNNModel(input_chunk_length=input_chunk_length,
                     model=custom_lstm_module,
                     hidden_dim=20,
                     n_rnn_layers=1,
                     dropout=0.0,
                     training_length=output_chunk_length,  # size of input and output time series
                     loss_fn=torch.nn.MSELoss(),  # used loss function
                     likelihood=None,  # not needed?!
                     torch_metrics=None,  # see  https://torchmetrics.readthedocs.io/en/latest/
                     optimizer_cls=torch.optim.Adam,  # used optimizer, Pytorch optimizer class
                     optimizer_kwargs=None,  # e.g., {"lr": 1e-3}
                     lr_scheduler_cls=None,  # None = constant learning rate, otherwise PyTorch LR scheduler class
                     lr_scheduler_kwargs=None,  # e.g., {"step_size": 50, "gamma": 0.5}
                     batch_size=1,  # how many samples per batch to load
                     n_epochs=1,  # how many epochs to train max
                     model_name=model_name,  # name of the model
                     work_dir="models/",  # dir to store tensorboard logs and checkpoints
                     log_tensorboard=True,  # logs tensorboard information at "{work_dir}/darts_logs/{model_name}/logs/"
                     nr_epochs_val_period=1,  # how often the model is evaluated
                     force_reset=False,
                     # if True, deletes all previous checkpoints and tensorboard logs with the given name
                     save_checkpoints=True,
                     # if True, saves checkpoints at "{work_dir}/darts_logs/{model_name}/checkpoints/"
                     add_encoders=None,
                     random_state=None,
                     pl_trainer_kwargs=pl_trainer_kwargs,  # parameters for the pytorch lightning trainer
                     show_warnings=False  # we do not need that ;)
                     )

    ###---TRAIN MODEL---###
    model.fit(train, val_series=test, verbose=True)

    ###---SAVE MODEL---###
    model.save(save_path)

    return True

What am I doing wrong?

ChristophKarlHeck changed the title from "[Question] How to add a Pytorch module to RNNModel" to "[Question] How to pass a Pytorch module to the constructor of the RNNModel" on Nov 22, 2023
madtoinou added the "question (Further information is requested)" label on Nov 22, 2023
dennisbader added the "bug (Something isn't working)" label and removed the "question" label on Nov 23, 2023
dennisbader (Collaborator) commented:

Thanks @ChristophKarlHeck for raising this issue. It is indeed a bug and will be fixed with #2088.
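For reference, one conceivable way to support both cases in rnn_model.py (a sketch only, under the assumption that branching on class vs. instance is acceptable; this is not necessarily the actual change in #2088) would be:

import inspect

import torch.nn as nn

# Hypothetical sketch -- not the actual fix from #2088.
if inspect.isclass(self.rnn_type_or_module) and issubclass(
    self.rnn_type_or_module, nn.Module
):
    # a module class was passed: construct it with the usual kwargs
    model = self.rnn_type_or_module(
        name="custom_module",
        input_size=input_dim,
        target_size=output_dim,
        nr_params=nr_params,
        hidden_dim=self.hidden_dim,
        dropout=self.dropout,
        num_layers=self.n_rnn_layers,
        **self.pl_module_params,
    )
else:
    # an already-built instance was passed: reuse it instead of calling it
    model = self.rnn_type_or_module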
