Consider separately passing each activation and each direction's weights & bias for lstm and gru #751

Open
Description

@philloooo

This is feedback from trying to implement gru/lstm on CoreML, driven by #689.
When the operator is bidirectional, the biases and weights for the forward and backward directions are stacked together in a single tensor; similarly, the activations are passed as an ordered array instead of as distinct, separately named parameters.
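For reference, here is a minimal sketch of the current bidirectional call, following the spec's `lstm(input, weight, recurrentWeight, steps, hiddenSize, options)` signature. The shapes, activation strings, and placeholder values are illustrative only:

```js
// Minimal sketch of the current WebNN shape for a bidirectional lstm.
const context = await navigator.ml.createContext();
const builder = new MLGraphBuilder(context);
const steps = 1, batchSize = 1, inputSize = 2, hiddenSize = 2;
const desc = (shape) => ({dataType: 'float32', shape});

const input = builder.input('input', desc([steps, batchSize, inputSize]));
// Forward and backward parameters are stacked along dimension 0 (numDirections = 2).
const weight = builder.constant(desc([2, 4 * hiddenSize, inputSize]),
    new Float32Array(2 * 4 * hiddenSize * inputSize));
const recurrentWeight = builder.constant(desc([2, 4 * hiddenSize, hiddenSize]),
    new Float32Array(2 * 4 * hiddenSize * hiddenSize));
const bias = builder.constant(desc([2, 4 * hiddenSize]),
    new Float32Array(2 * 4 * hiddenSize));

const outputs = builder.lstm(input, weight, recurrentWeight, steps, hiddenSize, {
  bias,
  direction: 'both',
  activations: ['sigmoid', 'tanh', 'tanh'],  // roles are positional, not named
});
```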

I think it's more explicit and cleaner to follow CoreML's design, which:

  • Passes the bias & weights for each direction separately when the operator is bidirectional
  • Passes the activations separately as recurrent_activation, cell_activation, and activation (a hypothetical sketch of this shape follows the list)
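To make the proposal concrete, a hypothetical sketch of what the reshaped call could look like; every new name below (backwardWeight, backwardRecurrentWeight, backwardBias, recurrentActivation, cellActivation, activation) is illustrative only, not proposed spec text, and the operands are assumed to be built as in the sketch above:

```js
// Hypothetical CoreML-style shape: positional arguments carry the forward
// direction, while backward parameters and named activation slots move into
// the options dictionary. All new option names here are illustrative.
const outputs = builder.lstm(input, forwardWeight, forwardRecurrentWeight, steps, hiddenSize, {
  bias: forwardBias,
  backwardWeight,
  backwardRecurrentWeight,
  backwardBias,
  direction: 'both',
  recurrentActivation: 'sigmoid',  // gate activation
  cellActivation: 'tanh',          // candidate cell activation
  activation: 'tanh',              // output activation
});
```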

What do you think?

This would also unblock the lstm/gru implementation on CoreML from depending on the outcome of the MLConstantOperand discussion.

@fdwr @huningxin
