How is PyTorch’s binary_cross_entropy_with_logits function related to sigmoid and binary_cross_entropy?

Yang Zhang
2 min read · Oct 16, 2018

This notebook breaks down how the binary_cross_entropy_with_logits function (corresponding to BCEWithLogitsLoss, used for binary and multi-label classification) is implemented in PyTorch, and how it relates to sigmoid and binary_cross_entropy.
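
In equations: sigmoid(x) = 1 / (1 + exp(-x)), and for predictions p in (0, 1), binary_cross_entropy(p, y) = -mean(y·log(p) + (1 - y)·log(1 - p)). The claim to verify below is that binary_cross_entropy_with_logits(x, y) equals binary_cross_entropy(sigmoid(x), y).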

Link to notebook:

import torch
import torch.nn as nn
import torch.nn.functional as F

Simulated x variable (the logits):

batch_size, n_classes = 10, 4
x = torch.randn(batch_size, n_classes)
x.shape

Out:

torch.Size([10, 4])

Run:

x

Out:

tensor([[ 2.3611, -0.8813, -0.5006, -0.2178],
[ 0.0419, 0.0763, -1.0457, -1.6692],
[-1.0494, 0.8111, 1.5723, 1.2315],
[ 1.3081, 0.6641, 1.1802, -0.2547],
[ 0.5292, 0.7636, 0.3692, -0.8318],
[ 0.5100, 0.9849, -1.2905, 0.2821],
[ 1.4662, 0.4550, 0.9875, 0.3143],
[-1.2121, 0.1262, 0.0598, -1.6363],
[ 0.3214, -0.8689, 0.0689, -2.5094],
[ 1.1320, -0.6824, 0.1657, -0.0687]])

Simulated y variable: draw integer class labels, then one-hot encode them.

target = torch.randint(n_classes, size=(batch_size,), dtype=torch.long)
target

Out:

tensor([1, 1, 3, 0, 2, 0, 2, 2, 1, 2])

Run:

y = torch.zeros(batch_size, n_classes)
y[range(y.shape[0]), target] = 1
y

Out:

tensor([[0., 1., 0., 0.],
[0., 1., 0., 0.],
[0., 0., 0., 1.],
[1., 0., 0., 0.],
[0., 0., 1., 0.],
[1., 0., 0., 0.],
[0., 0., 1., 0.],
[0., 0., 1., 0.],
[0., 1., 0., 0.],
[0., 0., 1., 0.]])
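
Aside: newer PyTorch versions (released after this post) ship a built-in for this one-hot step. A quick equivalent, assuming F.one_hot is available in your version:

y_alt = F.one_hot(target, num_classes=n_classes).float()
torch.equal(y, y_alt)  # True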

sigmoid + binary_cross_entropy

Run:

def sigmoid(x): return (1 + (-x).exp()).reciprocal()
def binary_cross_entropy(pred, y): return -(pred.log()*y + (1-y)*(1-pred).log()).mean()

pred = sigmoid(x)
loss = binary_cross_entropy(pred, y)
loss

Out:

tensor(0.7739)
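
Note that the mean is taken over all batch_size × n_classes elements, not just over the batch dimension. Dropping the .mean() makes this visible (a small check using the pred and y defined above):

per_elem = -(pred.log()*y + (1-y)*(1-pred).log())
per_elem.shape  # torch.Size([10, 4]); the loss above is per_elem.mean()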

torch.sigmoid + F.binary_cross_entropy

The same computation using PyTorch’s built-in functions:

pred = torch.sigmoid(x)
loss = F.binary_cross_entropy(pred, y)
loss

Out:

tensor(0.7739)
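
In current PyTorch, both F.binary_cross_entropy and F.binary_cross_entropy_with_logits also accept a reduction argument; passing reduction='none' returns the per-element losses from above instead of their mean:

F.binary_cross_entropy(pred, y, reduction='none').shape
# torch.Size([10, 4])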

F.binary_cross_entropy_with_logits

Finally, PyTorch’s fused binary_cross_entropy_with_logits function does both steps in a single call, operating directly on the logits:

F.binary_cross_entropy_with_logits(x, y)

Out:

tensor(0.7739)
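
All three approaches agree. The practical reason to prefer the fused function is numerical stability: computing sigmoid first and then taking logs can underflow to log(0) for logits of large magnitude. A stable reformulation via the log-sum-exp trick looks roughly like the following (a sketch of the standard identity, not necessarily PyTorch’s exact source):

def bce_with_logits_stable(x, y):
    # max(x, 0) - x*y + log(1 + exp(-|x|)) is algebraically equal to
    # -(y*log(sigmoid(x)) + (1-y)*log(1-sigmoid(x))), but never
    # exponentiates a large positive number.
    return (x.clamp(min=0) - x*y + (1 + (-x.abs()).exp()).log()).mean()

torch.allclose(bce_with_logits_stable(x, y), F.binary_cross_entropy_with_logits(x, y))
# True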

For more details on the implementation of the functions above, see here for a side-by-side translation of all of PyTorch’s built-in loss functions to Python and NumPy.
