Training From Scratch #120

Open
mullenj opened this issue Oct 10, 2019 · 3 comments

@mullenj

mullenj commented Oct 10, 2019

I've been having issues training this model from scratch. My model learns to output all background, although on some training iterations it learned to predict background plus one or two of the 20 Pascal VOC classes. I have checked my input images and my masks. The images are imported, resized to (512,512,3), and divided by 255.0 to get a range of 0-1. Masks are generated by taking the Pascal PNGs and converting each one to a one-hot (512,512,21) np array, which I have also checked for accuracy. I am not using the ignore labels and treat them like background.

When I import the model I set weights to None to train from scratch, use the xception backbone, and use a softmax activation to work with my Keras loss. I compile with SGD(learning_rate=0.001, momentum=0.9) and loss='categorical_crossentropy'.

The model shows an accuracy over 75% after training for about 20 epochs and a fairly low loss, but when I go to test, the argmax of the network output is usually an effectively all-background prediction.

Has anyone had this issue? If not, can anyone who has trained from scratch share the hyperparameters, optimizers, and losses they used? Am I making some fundamental mistake with the way I am gathering my data or instantiating the net? Please let me know.

Below are my functions for getting a batch of images and masks (usually used inside a custom data generator). img_dir points to the JPEG image folder of the Pascal dataset, and seg_mask_dir points to the folder with the PNG annotation files.

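import os

import cv2 as cv
import imageio
import numpy as np
import skimage.transform

# img_dir, seg_mask_dir and class_list are module-level globals defined elsewhere.

def get_ims(ids):
    ims = []
    for idx in ids:
        path = os.path.join(img_dir, idx + ".jpg")
        im = imageio.imread(path)
        # Nearest-neighbour resize to 512x512, keeping the 0-255 value range.
        im = skimage.transform.resize(im, (512,512), order = 0, preserve_range=True, anti_aliasing=False)
        im = np.array(im)
        im = im / 255.0  # scale to 0-1
        ims.append(im)
    return np.array(ims)

def get_gt_masks(ids):
    gt_masks = []
    for idx in ids:
        path = os.path.join(seg_mask_dir, idx + ".png")
        gt_mask = np.array(cv.imread(path))
        gt_mask = gt_mask[:,:,0]  # first channel of the decoded PNG
        mask = np.zeros((len(gt_mask[:,0]), len(gt_mask[0,:]), len(class_list)))
        # One channel per class: 1 where the pixel value equals the class index.
        for index in range(len(class_list)):
            mask[:,:,index] = np.where(gt_mask == index, 1, 0)
        mask = skimage.transform.resize(mask, (512,512), order = 0, preserve_range=True, anti_aliasing=False)
        # Pixels that matched no class (e.g. ignore labels) are counted as background.
        mask[:,:,0] = mask[:,:,0] + np.where(np.equal(0, np.max(mask, axis=2)), 1, 0)
        gt_masks.append(mask)
    return np.array(gt_masks)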
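
The mask encoding described above can also be sketched in vectorized form. This is a minimal sketch, not the code used in the post: the function name one_hot_mask and its parameters are hypothetical, and it assumes index_mask already holds class indices 0-20 with 255 for the ignore label. That assumption is worth verifying, since the original Pascal VOC annotation PNGs are palette-indexed and cv.imread with default flags decodes them to BGR colours, so a channel of the result may not contain class indices unless the colormap was removed beforehand.

```python
import numpy as np

def one_hot_mask(index_mask, num_classes=21, ignore_label=255):
    """Convert an (H, W) array of class indices to (H, W, num_classes) one-hot.

    Pixels equal to ignore_label are folded into class 0 (background),
    mirroring the choice described in the post.
    """
    index_mask = np.asarray(index_mask).astype(np.int64)
    index_mask = np.where(index_mask == ignore_label, 0, index_mask)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=np.float32)[index_mask]

# Tiny example: background, class 1, an ignore pixel, and class 20.
example = np.array([[0, 1], [255, 20]])
encoded = one_hot_mask(example)
print(encoded.shape)
```

A quick sanity check on real data is to print np.unique of the loaded index mask: for a correctly decoded Pascal VOC annotation it should contain only values in 0-20 plus 255.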

@MatthiasSchinzel

I had a similar problem. The loss and metrics on training were fine. However, for validation and testing the output was almost always zero, and the loss and metrics were poor.

Solution: I reduced the batch size during training to one and it worked.

Why? I don't know. This is the first time I have seen this kind of problem.

@Po-Hsuan-Huang

@mullenj You need to remove the background class from your loss function; otherwise the model will predict 0 (the background class) everywhere, which will give it around 75% accuracy.
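
The point about accuracy can be illustrated with a small NumPy sketch. It uses a made-up 4x4 label map in which three quarters of the pixels are background, and compares pixel accuracy with mean IoU for an all-background prediction. The helper names pixel_accuracy and mean_iou are hypothetical, and mean IoU is averaged here only over classes that appear in the labels or the prediction.

```python
import numpy as np

def pixel_accuracy(y_true, y_pred):
    """Fraction of pixels whose predicted class index matches the label."""
    return float(np.mean(y_true == y_pred))

def mean_iou(y_true, y_pred, num_classes):
    """Mean intersection-over-union over classes present in y_true or y_pred."""
    ious = []
    for c in range(num_classes):
        t, p = (y_true == c), (y_pred == c)
        union = np.logical_or(t, p).sum()
        if union == 0:
            continue  # class absent from both label and prediction
        ious.append(np.logical_and(t, p).sum() / union)
    return float(np.mean(ious))

# 16 pixels: 12 background (class 0) and 4 of class 1.
y_true = np.zeros((4, 4), dtype=np.int64)
y_true[0, :] = 1
y_pred = np.zeros_like(y_true)  # all-background prediction

print(pixel_accuracy(y_true, y_pred))            # 0.75
print(mean_iou(y_true, y_pred, num_classes=21))  # 0.375
```

Tracking a per-class metric such as mean IoU alongside accuracy makes an all-background collapse visible during training rather than only at test time.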

@yongyan123

Can anyone please help me with the loss function? My training label has the shape [256,256,21], and the output of the model is also [256,256,21]. How do I write the loss function? Thanks a lot!

I've tried this:
def sparse_crossentropy_ignoring_last_label(y_true, y_pred):
    nb_classes = K.int_shape(y_pred)[-1]
    y_true = K.one_hot(tf.cast(y_true[:, :, 0],tf.int32), nb_classes + 1)[:, :, :-1]
    y_pred = K.one_hot(tf.cast(y_pred[:, :, 0],tf.int32), nb_classes + 1)[:, :, :-1]
    return K.categorical_crossentropy(y_true, y_pred)

But I got an error:
ValueError: No gradients provided for any variable: ['Conv/kernel:0', 'Conv_BN/gamma:0', ............

It does not seem to work.
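
One likely reason for the error, offered as a reading of the snippet rather than a confirmed diagnosis: casting y_pred to int32 and one-hot encoding it discards the network's continuous outputs, so no differentiable path remains from the loss back to the weights. When the labels are already one-hot with the same shape as the softmax output, the cross-entropy can be computed on the two tensors directly. The sketch below shows the arithmetic in NumPy for clarity; the function name is hypothetical, and a trainable Keras version would need the same steps written with differentiable TensorFlow ops.

```python
import numpy as np

def one_hot_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean per-pixel cross-entropy.

    y_true: one-hot labels, shape (..., C)
    y_pred: softmax probabilities, same shape as y_true
    """
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    per_pixel = -np.sum(y_true * np.log(y_pred), axis=-1)
    return float(np.mean(per_pixel))

# Two pixels, three classes, uniform prediction: the loss equals ln(3).
labels = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
uniform = np.full((2, 3), 1.0 / 3.0)
print(one_hot_crossentropy(labels, uniform))
```

Note that y_pred is used as probabilities throughout and is never converted to integer class indices inside the loss.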
