Because the absolute value function is not differentiable at x = 0, this is a subgradient rather than a gradient. In practice, though, the weights essentially never become exactly 0, so it is effectively equivalent to the gradient.
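To spell this out, the subdifferential of the L1 term S·|w| is

```math
\partial\big(S\,|w|\big) =
\begin{cases}
\{\,S \cdot \mathrm{sign}(w)\,\} & w \neq 0,\\
[-S,\; S] & w = 0,
\end{cases}
```

so whatever value `torch.sign` returns at 0 (any of -1, 0, or 1), multiplying it by S still lands in [-S, S], i.e. the code picks a valid subgradient everywhere.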
Unlike PyTorch, Torch has no automatic differentiation, so I found this to be the most convenient way to do what we wanted, and we just used it.
The L1 sparsity penalty itself should be `torch.abs(weight)`; can you explain this line in more detail?
```lua
-- subgradient of the L1 penalty S * |weight|
local subgradient = S * torch.sign(weight)
```
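For a bit more context, here is a minimal sketch of how such a subgradient can be folded into a hand-rolled Torch training step. The toy model, the coefficient name `S`, and the learning rate are illustrative assumptions, not taken from the repo:

```lua
require 'torch'
require 'nn'

-- toy model: a single linear layer (illustrative only)
local model = nn.Linear(10, 1)
local parameters, gradParameters = model:getParameters()
local criterion = nn.MSECriterion()

local S = 1e-4            -- L1 penalty coefficient (assumed name)
local learningRate = 0.01

-- one SGD step with an L1 penalty on the weights
local input  = torch.randn(10)
local target = torch.randn(1)

gradParameters:zero()
local output = model:forward(input)
local loss = criterion:forward(output, target)
model:backward(input, criterion:backward(output, target))

-- add the L1 subgradient S * sign(w) to the data-loss gradient
gradParameters:add(S, torch.sign(parameters))

-- gradient descent update: w <- w - lr * (dL/dw + S * sign(w))
parameters:add(-learningRate, gradParameters)
```

Since there is no autodiff in Torch, the penalty's subgradient is simply added to the backprop gradients by hand before the parameter update, which is what the quoted line does.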