The kludge of clamping the derivative and the computed learning rate to the smallest representable value in the current fixed-point precision may cause instability at low fixed-point precisions.
I'm more aware of this problem as it relates to the computed learning rate, i.e.,
computed learning rate = learning rate / # items in a batch
As the number of items in a batch increases, the computed learning rate becomes ever smaller. For low fixed-point precisions this quickly hits the representation floor: with a 7-bit fractional representation, a batch size of 64 is the largest that still permits a reasonable learning rate of 0.5, since 0.5 / 64 = 2^-7 is already the smallest representable value. Letting the batch size grow well beyond this causes problems. For example, with a 7-bit fractional representation and 2048 batch items, the smallest learning rate whose computed value is still representable is 2^-7 * 2048 = 16, which is nearly guaranteed to cause instability.
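To make the arithmetic concrete, here is a small stand-alone C sketch (not FANN or fann-xfiles code; all names are illustrative) that reproduces the numbers above: with 7 fractional bits the floor is 2^-7, a learning rate of 0.5 with a batch of 64 lands exactly on that floor, and a batch of 2048 pushes the minimum usable learning rate up to 16.

```c
/* Stand-alone sketch (not FANN/fann-xfiles code; names are illustrative):
 * shows why the computed learning rate hits the fixed-point floor. */
#include <stdio.h>

int main(void) {
    int    frac_bits = 7;                  /* fractional bits of the format */
    double lsb = 1.0 / (1 << frac_bits);   /* smallest step: 2^-7 = 0.0078125 */

    /* learning rate 0.5 with a batch of 64 lands exactly on the floor */
    double lr = 0.5;
    int    batch = 64;
    printf("computed lr = %f (lsb = %f)\n", lr / batch, lsb);

    /* with 2048 batch items, the user-level learning rate must be at
     * least lsb * batch = 16 before the computed value is representable */
    batch = 2048;
    printf("minimum usable lr = %f\n", lsb * batch);
    return 0;
}
```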
There are a couple of ways to get around this:
Use a larger internal precision for the learning rate computations (see the sketch after this list)
Limit the batch size so that the minimum usable learning rate is not artificially inflated
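As a hedged illustration of the first option (a sketch under assumptions, not the fann-xfiles implementation): the learning-rate division and the weight-delta accumulation can be carried out in a wider internal fixed-point format, and only rounded back to the 7-bit storage precision when the batch update is applied. All widths and names below are assumptions.

```c
/* Hedged sketch of workaround 1, not the fann-xfiles implementation:
 * do the learning-rate division and delta accumulation in a wider
 * internal fixed-point format (Q.24 here) and round back to the 7-bit
 * storage precision only when the batch update is applied.
 * All widths and names are illustrative assumptions. */
#include <stdint.h>
#include <stdio.h>

#define STORE_FRAC 7    /* fractional bits of stored weights           */
#define WIDE_FRAC  24   /* fractional bits used internally for updates */

int main(void) {
    int32_t batch   = 2048;
    int32_t lr_wide = (int32_t)(0.5 * (1 << WIDE_FRAC));   /* 0.5 in Q.24 */

    /* computed learning rate stays nonzero in the wide format:
     * 0.5 / 2048 = 2^-12, far below the 2^-7 storage floor */
    int32_t step_wide = lr_wide / batch;

    /* accumulate per-item gradients for one weight over the batch
     * (a constant example gradient of 0.25 per item) */
    int64_t grad_sum_wide = 0;
    for (int i = 0; i < batch; i++)
        grad_sum_wide += (int32_t)(0.25 * (1 << WIDE_FRAC));

    /* delta = (lr / batch) * sum(gradients), still in Q.24 */
    int64_t delta_wide = (step_wide * grad_sum_wide) >> WIDE_FRAC;

    /* round to Q.7 only at the very end: 0.5 * 0.25 = 0.125 -> 16 LSBs */
    int32_t delta_store = (int32_t)((delta_wide + (1 << (WIDE_FRAC - STORE_FRAC - 1)))
                                    >> (WIDE_FRAC - STORE_FRAC));

    printf("computed lr (Q.24) = %d, weight delta (Q.7 LSBs) = %d\n",
           (int)step_wide, (int)delta_store);
    return 0;
}
```

The key point is that the intermediate format only needs to exist during training; the stored weights and the inference path can stay at the original precision.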
fann-xfiles will currently fail if this behavior is detected, but that is not a legitimate solution.