8.8 Adding more layers

Slides

It is also possible to add more layers between the vector representation layer and the output layer to perform intermediate processing of the vector representation. These inner layers are dense layers just like the output layer; the difference is that they use the relu activation function to introduce non-linearity.

As with learning rates, we should also experiment with different inner layer sizes:

# Imports (these are defined earlier in the notebook, repeated here for completeness)
from tensorflow import keras
from tensorflow.keras.applications.xception import Xception

# Function to define model by adding a new dense layer
def make_model(learning_rate=0.01, size_inner=100): # default inner layer size is 100
    base_model = Xception(weights='imagenet',
                          include_top=False,
                          input_shape=(150,150,3))

    base_model.trainable = False
    
    #########################################
    
    inputs = keras.Input(shape=(150,150,3))
    base = base_model(inputs, training=False)
    vectors = keras.layers.GlobalAveragePooling2D()(base)
    inner = keras.layers.Dense(size_inner, activation='relu')(vectors) # activation function 'relu'
    outputs = keras.layers.Dense(10)(inner)
    model = keras.Model(inputs, outputs)
    
    #########################################
    
    optimizer = keras.optimizers.Adam(learning_rate=learning_rate)
    loss = keras.losses.CategoricalCrossentropy(from_logits=True)

    # Compile the model
    model.compile(optimizer=optimizer,
                  loss=loss,
                  metrics=['accuracy'])
    
    return model
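
To confirm that the inner layer is actually part of the network, you can build a model and inspect its architecture. This is just a quick sanity check, not part of the original notebook:

# Quick check: the Dense layer of size 100 should appear between the
# global average pooling layer and the Dense(10) output layer
model = make_model(learning_rate=0.001, size_inner=100)
model.summary()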

Next, train the model with different inner layer sizes:

# Experiment with different inner layer sizes using the best learning rate
# Note: We should've added checkpointing for training, but for simplicity we are skipping it here
learning_rate = 0.001

scores = {}

# List of inner layer sizes
sizes = [10, 100, 1000]

for size in sizes:
    print(size)
    
    model = make_model(learning_rate=learning_rate, size_inner=size)
    history = model.fit(train_ds, epochs=10, validation_data=val_ds)
    scores[size] = history.history
    
    print()
    print()
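
To compare the runs, we can plot the validation accuracy of each inner layer size, just as we did for learning rates. A minimal sketch, assuming history.history contains the 'val_accuracy' key that Keras records when metrics=['accuracy'] is used:

import matplotlib.pyplot as plt

# Plot validation accuracy per epoch for each inner layer size
for size, hist in scores.items():
    plt.plot(hist['val_accuracy'], label='size=%d' % size)

plt.xlabel('epoch')
plt.ylabel('validation accuracy')
plt.legend()
plt.show()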

Note: The model may not always improve. Adding more layers means introducing more complexity into the model, which may not be recommended in some cases.

In the next section, we'll try a regularization technique to improve the performance of the model with the added inner layer.

Notes

Add notes from the video (PRs are welcome)

  • softmax takes the raw scores from a dense layer and transforms them into probabilities
  • the activation function used for the output layer differs from the activation functions used for intermediate layers
  • have a look at http://cs231n.stanford.edu/2017/
  • sigmoid: squashes any input into a value between 0 and 1 (used for binary outputs)
  • relu: negative input --> zero, positive input --> straight line (the identity)
  • softmax: a generalization of sigmoid to multiple classes; see the sketch after this list
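
As a small illustration of these three functions (a sketch with made-up scores, not from the video), the snippet below computes sigmoid, relu, and softmax with NumPy:

import numpy as np

z = np.array([-2.0, 0.0, 3.0])  # made-up raw scores from a dense layer

sigmoid = 1 / (1 + np.exp(-z))         # each score squashed into (0, 1) independently
relu = np.maximum(0, z)                # negative scores become 0, positive stay as-is
softmax = np.exp(z) / np.exp(z).sum()  # scores become a probability distribution that sums to 1

print(sigmoid)  # approx. [0.119, 0.5, 0.953]
print(relu)     # [0., 0., 3.]
print(softmax)  # approx. [0.006, 0.047, 0.946]
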
⚠️ The notes are written by the community.
If you see an error here, please create a PR with a fix.

Navigation