Bad forward .ark file output when the model is a sequential model #235

Open
@timolohrenz

Description

I ran into an issue when decoding the outputs of a sequential model. Kaldi fails to read the written .ark files for decoding and throws the following error:

ERROR (latgen-faster-mapped-parallel[5.5.646~1-cdf2]:DecodableMatrixScaledMapped():decoder/decodable-matrix.h:55) DecodableMatrixScaledMapped: mismatch, matrix has 1 cols but transition-model has 1992 pdf-ids.

I traced it to the following line in core.py:

out_save = outs_dict[forward_outs[out_id]].data.cpu().numpy()

Here the out_save array still contains the singleton batch dimension of the forwarded sequential model. This does not normally happen, because PyTorch-Kaldi models usually implement e.g. the softmax function as a separate model of its own. In my case the softmax is included in the model class of the sequential model, and this is where it goes wrong.

I added a small fix that squeezes the redundant dimension out of out_save with np.squeeze, but it needs to be tested before I can open a pull request.
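For illustration, here is a minimal sketch of what such a squeeze could look like. It is not the tested fix: the helper name `drop_batch_axis` is made up, and it assumes the output is laid out as (frames, batch, pdf-ids) with a batch size of 1. It removes only that one axis, because a bare np.squeeze would also collapse an utterance that happens to have a single frame.

```python
import numpy as np


def drop_batch_axis(out_save):
    """Return a 2-D (frames, pdf_ids) matrix from a forward output.

    Hypothetical helper. Assumes a 3-D input is laid out as
    (frames, batch, pdf_ids) with batch == 1.
    """
    out_save = np.asarray(out_save)
    # Only touch 3-D arrays whose middle (batch) axis is a singleton.
    if out_save.ndim == 3 and out_save.shape[1] == 1:
        out_save = np.squeeze(out_save, axis=1)
    return out_save


# Stand-in for outs_dict[forward_outs[out_id]].data.cpu().numpy()
example = np.zeros((50, 1, 1992), dtype=np.float32)
fixed = drop_batch_axis(example)
print(fixed.shape)
```

Outputs that are already 2-D pass through unchanged, so models that apply the softmax as a separate model should not be affected under these assumptions.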
