Hello,
I need some clarification about what a decoder (such as the self-attention one) takes as input. In your example you wrote:
https://github.com/OpenNMT/OpenNMT-tf/blob/1192c8a01b3ab91b09df8b1e80a4b1b5ec570dbc/examples/library/minimal_transformer_training.py#L89
Here you pass `target_embedding` as `inputs` to `decoder.decode()`, and the `memory` parameter contains the output of the encoder, which is optional.
Could you explain why the `memory` parameter is not mandatory? Don't we always need something to be decoded?
And why is the `inputs` parameter mandatory? It contains the target sentence embeddings, and sometimes you don't have a target sentence (at translation time, for example).
Thanks
Hi,
> Could you explain why the `memory` parameter is not mandatory? Don't we always need something to be decoded?

Decoders can also be used for language modelling, where `memory` does not exist.
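To illustrate the idea (a plain-Python sketch, not the actual OpenNMT-tf API): a decoder step only attends to `memory` when it is provided, so the same decoder can serve translation (memory = encoder output) and language modelling (no memory at all).

```python
def decoder_step(target_state, memory=None):
    # Self-attention over the target side (simplified here as a sum).
    output = sum(target_state)
    if memory is not None:
        # Cross-attention to the encoder output, used only for
        # sequence-to-sequence tasks such as translation.
        output += sum(memory)
    return output

# Language modelling: no encoder, so no memory is passed.
lm_out = decoder_step([1.0, 2.0])          # -> 3.0
# Translation: memory holds the encoder states.
mt_out = decoder_step([1.0, 2.0], [0.5])   # -> 3.5
```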
> And why is the `inputs` parameter mandatory? It contains the target sentence embeddings, and sometimes you don't have a target sentence (at translation time, for example).

The `decode` method expects the full target sequence and is used for training. The inference method `dynamic_decode_and_search` just takes the embedding matrix and the start ids.
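The contrast between the two paths can be sketched like this (plain Python, not the OpenNMT-tf signatures): training uses teacher forcing on the full gold target, while inference generates tokens one by one from a start id, feeding its own predictions back in.

```python
END_ID = 0

def next_token(prefix):
    # Stand-in for the model: emits decreasing ids until END_ID.
    return max(prefix[-1] - 1, END_ID)

def decode_training(target_ids):
    # Teacher forcing: every prefix of the gold target is available,
    # so all positions can be scored in one pass.
    return [next_token(target_ids[:i + 1]) for i in range(len(target_ids))]

def decode_inference(start_id, max_len=10):
    # No gold target at translation time: start from the start id and
    # feed back our own predictions until END_ID (or a length limit).
    prefix = [start_id]
    while prefix[-1] != END_ID and len(prefix) < max_len:
        prefix.append(next_token(prefix))
    return prefix[1:]

print(decode_training([3, 2, 1]))  # predictions from gold prefixes: [2, 1, 0]
print(decode_inference(3))         # generated autoregressively: [2, 1, 0]
```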