Custom Loss Function Criterion

Hi all! Sorry in advance for the incomplete links. I'm a new user and the system wouldn't let me include them, but I still wanted to reference what I was looking at; for both the OpenNMT-py and PyTorch paths below, just prepend github.com/ to each path.

I'm trying to build a custom loss function to use with an OpenNMT-py model. I've seen this post, where there are some references for the Lua implementation, but I'm finding I cannot follow through for Python.

If I look at the Loss module @ OpenNMT/OpenNMT-py/blob/master/onmt/, it seems that since I don't need to change the overall logic, I would only need to change the crit input to that object. To do so, I've looked into the NMTCriterion @ OpenNMT/OpenNMT-py/blob/master/onmt/ to use as a template.
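For concreteness, this is the shape of what I have in mind; a sketch of my own (not actual OpenNMT-py code), assuming the Loss module only ever calls crit(log_probs, target) the way the nn.NLLLoss built by NMTCriterion is called, and with vocab_size/padding_idx standing in for the real constants:

import torch
import torch.nn as nn

class MyCriterion(nn.Module):
    """Hypothetical drop-in for the crit that NMTCriterion returns."""

    def __init__(self, vocab_size, padding_idx):
        super(MyCriterion, self).__init__()
        # Mirror NMTCriterion: zero weight on the padding token so it
        # contributes nothing to the loss, and sum instead of average.
        weight = torch.ones(vocab_size)
        weight[padding_idx] = 0
        self.nll = nn.NLLLoss(weight, size_average=False)

    def forward(self, log_probs, target):
        # The custom math would replace or augment plain NLL here.
        return self.nll(log_probs, target)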

Now is where my troubles begin:

  1. When I go to where nn.NLLLoss @ OpenNMT/OpenNMT-py/blob/master/onmt/ is defined, I get deep into the torch modules.
  2. I can follow it through up to the definition of forward:
def forward(self, input, target):
        return F.nll_loss(input, target, self.weight, self.size_average,
                          self.ignore_index)
  3. Now I try to understand what F.nll_loss @ pytorch/pytorch/blob/master/torch/nn/modules/ does (I sketch a manual recomputation after this list), so I go to its definition @ pytorch/pytorch/blob/master/torch/nn/:
def nll_loss(input, target, weight=None, size_average=True, ignore_index=-100):
    r"""The negative log likelihood loss.

    See :class:`~torch.nn.NLLLoss` for details.

    Args:
        input: :math:`(N, C)` where `C = number of classes` or `(N, C, H, W)`
            in case of 2D - Loss
        target: :math:`(N)` where each value is `0 <= targets[i] <= C-1`
        weight (Variable, optional): a manual rescaling weight given to each
            class. If given, has to be a Variable of size "nclasses"
        size_average (bool, optional): By default, the losses are averaged
            over observations for each minibatch. If size_average
            is False, the losses are summed for each minibatch. Default: True
        ignore_index (int, optional): Specifies a target value that is ignored
            and does not contribute to the input gradient. When size_average is
            True, the loss is averaged over non-ignored targets. Default: -100

    Example:
        >>> # input is of size nBatch x nClasses = 3 x 5
        >>> input = autograd.Variable(torch.randn(3, 5))
        >>> # each element in target has to have 0 <= value < nclasses
        >>> target = autograd.Variable(torch.LongTensor([1, 0, 4]))
        >>> output = F.nll_loss(F.log_softmax(input), target)
        >>> output.backward()
    """
    dim = input.dim()
    if dim == 2:
        return _functions.thnn.NLLLoss.apply(input, target, weight, size_average, ignore_index)
    elif dim == 4:
        return _functions.thnn.NLLLoss2d.apply(input, target, weight, size_average, ignore_index)
    else:
        raise ValueError('Expected 2 or 4 dimensions (got {})'.format(dim))
  4. And for the life of me, I cannot work out what that does. I can try to drill down in my IDE, or step through in debug mode, but that only makes things worse, as I arrive at something like this:
    def forward(ctx, input, target, *args):
        ctx._backend = type2backend[type(input)]
        ctx.save_for_backward(input, target)
        if weight_arg_idx >= 0:
            ctx.weight = args[0]
            args = args[1:]
            ctx.additional_args = list(args)
            insert_idx = weight_arg_idx - 4  # state, input, target, output
            ctx.additional_args.insert(insert_idx, ctx.weight)
        else:
            ctx.additional_args = list(args)

        ctx.forward_args_count = len(ctx.additional_args)
        for idx in buffers_idx:
            ctx.additional_args.insert(idx, input.new(1))

        output = input.new(1)
        getattr(ctx._backend, update_output.name)(ctx._backend.library_state, input, target,
                                                  output, *ctx.additional_args)
        return output
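To sanity-check my reading, I recomputed the loss by hand. This sketch of mine matches what F.nll_loss returns for 2-D input (ignoring the weight and ignore_index handling):

import torch
import torch.nn.functional as F
from torch.autograd import Variable

# nBatch x nClasses = 3 x 5, as in the docstring example above
input = Variable(torch.randn(3, 5))
target = Variable(torch.LongTensor([1, 0, 4]))
log_probs = F.log_softmax(input)

# Take each row's log-probability at its target index...
picked = log_probs.gather(1, target.view(-1, 1)).squeeze(1)
# ...then negate and average over the batch (the size_average default).
manual = -picked.mean()

print(manual.data[0])
print(F.nll_loss(log_probs, target).data[0])  # should match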

All I want is to find the template from which the criterion is effectively calculated, so I can create a modified copy that suits my needs.
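To make that concrete, here is a guess on my part (not anything from the codebase) of what such a template could look like if written with plain differentiable tensor ops, so that autograd derives backward() and none of the THNN plumbing above is needed:

import torch.nn as nn

class EditableNLLLoss(nn.Module):
    """Hypothetical stand-in for nn.NLLLoss(size_average=False) whose
    math can be edited directly; autograd handles the backward pass."""

    def forward(self, log_probs, target):
        # log_probs: (N, C) log-probabilities; target: (N,) class indices
        picked = log_probs.gather(1, target.view(-1, 1)).squeeze(1)
        loss = -picked.sum()
        # Custom penalties or reweighting would slot in here.
        return loss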

Does anyone have an idea of where and how this would be done?



Hi @juanshlm, were you able to figure this out?

Yes! I solved it on PyTorch's forum!

Can you share the link to the thread where you found the solution? I am stuck in the same situation.

@groverpr this is the thread:

But the links are outdated. I could find this on an older branch:

If you find the new location, please comment either here or on the PyTorch thread!
