How to change the inputs of the Transformer during training (R-drop experiment)

Hi, I’m trying to experiment with the R-Drop paper in OpenNMT-tf: R-Drop: Regularized Dropout for Neural Networks
First, during training, I need to repeat the input data x and concatenate the copies ([x; x]) along the batch dimension.
Second, I add an R-drop loss computation function in loss.py.
Third, I modify the loss computation used on the training dataset (sequence_to_sequence_compute_loss), adding a parameter that switches R-drop on and off: when R-drop is enabled, the function returns the R-drop loss instead of the plain cross entropy.
But I can’t figure out how to repeat the input data x and concatenate the copies ([x; x]) along the batch dimension. Moreover, if I want to use multiple GPUs, the situation becomes more complicated.
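For concreteness, here is a minimal sketch of the loss I have in mind for the second and third steps: cross entropy on both dropout passes plus a weighted symmetric KL term, assuming the batch already holds the two copies stacked as [x; x]. The function name rdrop_loss and the alpha argument are placeholders of my own, not OpenNMT-tf APIs:

import tensorflow as tf

def rdrop_loss(logits, labels, alpha=5.0):
    # Split the doubled batch back into the two dropout passes.
    logits_1, logits_2 = tf.split(logits, 2, axis=0)
    labels_1, _ = tf.split(labels, 2, axis=0)

    # Cross entropy for each pass (the labels are identical in both halves).
    ce_1 = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels_1, logits=logits_1)
    ce_2 = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels_1, logits=logits_2)

    # Symmetric KL divergence between the two predictive distributions.
    log_p = tf.nn.log_softmax(logits_1, axis=-1)
    log_q = tf.nn.log_softmax(logits_2, axis=-1)
    kl = 0.5 * (
        tf.reduce_sum(tf.exp(log_p) * (log_p - log_q), axis=-1)
        + tf.reduce_sum(tf.exp(log_q) * (log_q - log_p), axis=-1)
    )

    # alpha is the KL weight from the paper.
    return tf.reduce_mean(ce_1 + ce_2 + alpha * kl)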

Hi,

You probably want to make this change in the model’s call method. There you can access the features and labels, which correspond to x and y.

You can apply the concatenation to all inner features using something like this:

features = tf.nest.map_structure(lambda x: tf.concat([x, x], 0), features)
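The labels can be duplicated the same way, which the loss will need as well. Below is an untested sketch of a model subclass doing both; it assumes the catalog model opennmt.models.TransformerBase and the call(features, labels=None, training=None, step=None) signature of OpenNMT-tf v2 models, so adapt it to whatever model class you actually use:

import tensorflow as tf
import opennmt

class RDropTransformer(opennmt.models.TransformerBase):
    def call(self, features, labels=None, training=None, step=None):
        if training:
            # Duplicate every inner tensor along the batch dimension: [x; x].
            features = tf.nest.map_structure(lambda t: tf.concat([t, t], 0), features)
            if labels is not None:
                labels = tf.nest.map_structure(lambda t: tf.concat([t, t], 0), labels)
        return super().call(features, labels=labels, training=training, step=step)

Regarding multiple GPUs: with a mirrored distribution strategy, each replica should duplicate only its own shard of the batch, so the duplication stays replica-local. Just note that the effective per-GPU batch size doubles, so you may want to reduce batch_size in the training configuration to compensate.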

Thank you very much! I will try it.

Hello, have you figured it out? I am also working on how to train a model using R-drop.