
Issue with Resnet18 Implementation #8

Open
simomagi opened this issue Apr 22, 2021 · 0 comments

simomagi commented Apr 22, 2021

I think something is missing in the Resnet18 implementation:

    class BasicBlock(nn.Module):
        expansion = 1

        def __init__(self, in_planes, planes, stride=1, config={}):
            super(BasicBlock, self).__init__()
            self.conv1 = conv3x3(in_planes, planes, stride)
            self.conv2 = conv3x3(planes, planes)

            self.shortcut = nn.Sequential()
            if stride != 1 or in_planes != self.expansion * planes:
                self.shortcut = nn.Sequential(
                    nn.Conv2d(in_planes, self.expansion * planes, kernel_size=1,
                              stride=stride, bias=False),
                )
            self.IC1 = nn.Sequential(
                nn.BatchNorm2d(planes),
                nn.Dropout(p=config['dropout'])
            )

            self.IC2 = nn.Sequential(
                nn.BatchNorm2d(planes),
                nn.Dropout(p=config['dropout'])
            )

        def forward(self, x):
            out = self.conv1(x)
            out = relu(out)
            out = self.IC1(out)

            out += self.shortcut(x)
            out = relu(out)
            out = self.IC2(out)
            return out

Two attributes are defined in the class BasicBlock; they represent the standard convolution operations used in ResNet-X architectures:

self.conv1 = conv3x3(in_planes, planes, stride)
self.conv2 = conv3x3(planes, planes)

The problem is that in the forward pass only the first one (i.e. self.conv1) and the two batch-normalization layers are used to compute the output. Furthermore, when the model is loaded on the GPU, both conv1 and conv2 are moved to it, yet the second one is never used in the forward pass. So I think the code of the forward pass should be:

    def forward(self, x):
        out = self.conv1(x)
        out = relu(out)
        out = self.IC1(out)
        out = self.conv2(out)
        out = self.IC2(out)

        out += self.shortcut(x)
        out = relu(out)
        return out
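
One way to confirm that conv2 never participates in the forward pass is to run a single forward/backward step and list the parameters that never receive a gradient. This is only a minimal sketch assuming a standard PyTorch training loop; the model, batch, and loss names below are placeholders, not code from this repository:

    import torch
    import torch.nn as nn

    def find_unused_parameters(model, inputs, targets, loss_fn):
        # Run one forward/backward pass and return the names of trainable
        # parameters that never received a gradient. Modules that are
        # registered (and hence moved to the GPU with the model) but skipped
        # in forward() keep p.grad == None after backward().
        model.zero_grad()
        loss = loss_fn(model(inputs), targets)
        loss.backward()
        return [name for name, p in model.named_parameters()
                if p.requires_grad and p.grad is None]

    # Hypothetical usage: with the original forward(), every block's
    # conv2.weight is expected to show up in this list.
    # unused = find_unused_parameters(model, x_batch, y_batch, nn.CrossEntropyLoss())
    # print(unused)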

After fixing this problem, I am not able to reproduce the results of the paper. On CIFAR-100, using the hyperparameters provided in this closed issue:

--dataset cifar100 --tasks 20 --epochs-per-task 1 --lr 0.15 --gamma 0.85 --batch-size 10 --dropout 0.1 --seed 1234

average accuracy = 51.339999999999996, forget = 0.11200000000000002

If I instead use the hyperparameters provided in replicate_experiment_2.sh:

--dataset cifar100 --tasks 20 --epochs-per-task 1 --lr 0.1 --gamma 0.8 --hiddens 256 --batch-size 10 --dropout 0.5 --seed 1234

average accuracy = 44.550000000000004, forget = 0.05421052631578946

These results differ considerably from the results reported in the paper:

average_accuracy=59.9 and forgetting=0.08
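
For context, I assume "average accuracy" and "forgetting" here are the standard continual-learning metrics: the mean accuracy over all tasks after training on the last task, and the average gap between each task's best accuracy during training and its final accuracy. A small sketch of those definitions (assuming an accuracy matrix acc where acc[t, k] is the accuracy on task k after training on task t):

    import numpy as np

    def average_accuracy(acc):
        # Mean accuracy over all tasks, measured after the final task.
        T = acc.shape[0]
        return acc[T - 1].mean()

    def forgetting(acc):
        # For each task seen before the last one: best accuracy it ever
        # reached minus its accuracy at the end, averaged over tasks.
        T = acc.shape[0]
        drops = [acc[:T - 1, k].max() - acc[T - 1, k] for k in range(T - 1)]
        return float(np.mean(drops))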

Could you provide the correct hyperparameters?
