adding a 'definition' attribute (or method) to the class Model #836

bertinets · 2023-01-30T14:08:46Z

bertinets
Jan 30, 2023

I think it would be useful to have a string representation of the mathematical definition of the model, especially for the built-in models, so that one could quickly check what is the functional form of the expression used to generate the model.
For example:

import lmfit
model = lmfit.models.lmfit_models['Exponential']()
model.definition

would return the string:
'amplitude* e^(-x/decay)'

or

import lmfit
model = lmfit.models.lmfit_models['Thermal Distribution'](form='maxwell')
model.definition

would return the string:
'1/(amplitude*exp((x - center)/(kt)))'

Also, I am bringing this up because I am using lmfit in combination with sympy, which I use to calculate the analytic jacobian of the fitting model that in turn can be used to speed up least_squares. For this sake, at the moment, I have taken from the lmfit documentation of the line shapes definition the expressions of all the models and transformed them in a string that can be used to generate a symbolic function using sympy.sympify.

It would be of course easier if one could do that in a single step like for example:

import lmfit
model = lmfit.models.lmfit_models['Thermal Distribution'](form='maxwell')
sym_function = sympy.sympify(model.definition)

Or maybe this is already possible with the current version of lmfit? I have read the documentation and the code quite extensively and I couldn't find a way to generate the model's mathematical expression as a string.
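As a rough illustration of the single-step workflow asked for above, here is a minimal sketch. It assumes a hypothetical user-maintained lookup table `DEFINITIONS` standing in for the proposed `model.definition`, since lmfit does not provide such an attribute. The string uses `exp(...)` rather than `e^(...)` because `sympify` would otherwise treat `e` as an ordinary symbol.

```python
import numpy as np
import sympy

# Hypothetical lookup table standing in for the proposed `model.definition`;
# lmfit itself does not ship these strings.
DEFINITIONS = {
    'Exponential': 'amplitude * exp(-x/decay)',
}

# Parse the string into a symbolic expression.
expr = sympy.sympify(DEFINITIONS['Exponential'])
x, amplitude, decay = sympy.symbols('x amplitude decay')

# Symbolic partial derivatives with respect to each fit parameter.
d_amplitude = expr.diff(amplitude)
d_decay = expr.diff(decay)

# Turn the symbolic expression into a function that accepts NumPy arrays.
func = sympy.lambdify([x, amplitude, decay], expr, 'numpy')
xs = np.linspace(0.0, 5.0, 6)
values = func(xs, 2.0, 1.5)
```

The same pattern would apply to any model whose definition fits in a single expression that sympy can parse.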

newville · 2023-02-01T04:07:05Z

newville
Feb 1, 2023
Maintainer

@bertinets What happens if the "definition" is more complex than a single line? Even a little more complex (say, RectangleModel) would be very hard to render or use for automated differentiation. The function definitions are expected to be in Python, not parseable mathematical strings.

It should also be mentioned that trying to use sympy to construct analytic derivatives would also need to include the transformations used for bounded variables. That is probably possible, but I would not guess it is easy.

1 reply

bertinets Feb 1, 2023
Author

@newville Yes, you are right that for the StepModel and RectangleModel in their 'linear' form it would require a little more work. I think in that case one could use Piecewise to address the issue.

For example, for the linear step (and similarly for the rectangle):
Piecewise((0, (x - center)/sigma < -0.5), (amplitude*(0.5+(x - center)/sigma), And((x - center)/sigma>=-0.5, (x - center)/sigma <= 0.5)),(amplitude, (x - center)/sigma>0.5))

And this would also be parsable and differentiable by sympy. Of course, the derivative would be a piecewise function as well, so it may not be usable for the Jacobian (although I haven't tried yet).
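A minimal sketch of the Piecewise idea, using the string from this reply as-is. The parameter values and the variable names (`step_fn`, `d_amp_fn`) are illustrative assumptions. It only checks that sympy can parse, differentiate, and lambdify the expression; it does not show that the result is suitable for a fit.

```python
import numpy as np
import sympy

# Linear step written as a sympy-parsable Piecewise string (taken from the reply above).
step_str = ('Piecewise((0, (x - center)/sigma < -0.5), '
            '(amplitude*(0.5 + (x - center)/sigma), '
            'And((x - center)/sigma >= -0.5, (x - center)/sigma <= 0.5)), '
            '(amplitude, (x - center)/sigma > 0.5))')
step = sympy.sympify(step_str)

x, amplitude, center, sigma = sympy.symbols('x amplitude center sigma')
args = [x, amplitude, center, sigma]

# The model and its derivative with respect to amplitude, both as NumPy callables.
# The derivative is itself a Piecewise expression.
step_fn = sympy.lambdify(args, step, 'numpy')
d_amp_fn = sympy.lambdify(args, step.diff(amplitude), 'numpy')

xs = np.array([-1.0, 0.0, 1.0])
step_vals = step_fn(xs, 2.0, 0.0, 1.0)    # amplitude=2, center=0, sigma=1
d_amp_vals = d_amp_fn(xs, 2.0, 0.0, 1.0)
```

The derivatives with respect to `center` and `sigma` are discontinuous at the two break points, which is one reason such a Jacobian may behave poorly in practice.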

"trying to use sympy to construct analytic derivatives would also need to include the transformations used for bounded variables"

This is so very right and I'm currently struggling with finding a way to implement that.

newville · 2023-02-02T16:36:17Z

newville
Feb 2, 2023
Maintainer

@bertinets But, how would you know that you need to use "Piecewise"? What if the definition includes a function that sympy does not recognize or identifies incorrectly?

Like, what would your plan be for "Voigt", defined as "amplitude*wofz(z).real / (sigma*s2pi)", or "Pearson4" or "Pearson7" which use other functions from scipy but not provided in numpy?

What if the function is more than a one-line expression?

Perhaps if you show that using a sympy-generated analytic Jacobian can work, and is faster for a very simple model like Gaussian (including the bound sigma > 0), then we can worry about how to make that more easily generalizable.
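To make the Gaussian suggestion concrete, here is a minimal sketch of a sympy-generated Jacobian that includes a lower bound on sigma. It assumes the MINUIT-style mapping that the lmfit documentation describes for a parameter with only a lower bound, `value = min - 1 + sqrt(internal**2 + 1)`. The internal variable name `u` and the helper `jacobian` are hypothetical, and this is not how lmfit computes derivatives internally.

```python
import numpy as np
import sympy

x, amplitude, center, sigma, u = sympy.symbols('x amplitude center sigma u', real=True)

# Gaussian line shape in the parametrization used by lmfit's GaussianModel.
gauss = amplitude / (sigma * sympy.sqrt(2 * sympy.pi)) \
    * sympy.exp(-(x - center)**2 / (2 * sigma**2))

# Assumed bound transformation for sigma > sigma_min (here sigma_min = 0):
# the optimizer varies the unbounded internal value u instead of sigma.
sigma_min = 0
sigma_of_u = sigma_min - 1 + sympy.sqrt(u**2 + 1)
gauss_u = gauss.subs(sigma, sigma_of_u)

# Differentiating after the substitution applies the chain rule automatically.
free = [amplitude, center, u]
model = sympy.lambdify([x, *free], gauss_u, 'numpy')
jac_cols = [sympy.lambdify([x, *free], gauss_u.diff(p), 'numpy') for p in free]

def jacobian(xdata, pvals):
    """Return the (len(xdata), 3) Jacobian with respect to (amplitude, center, u)."""
    cols = [np.broadcast_to(col(xdata, *pvals), xdata.shape) for col in jac_cols]
    return np.column_stack(cols)

xs = np.linspace(-3.0, 3.0, 31)
p = np.array([5.0, 0.3, 1.2])   # amplitude, center, internal value u
J = jacobian(xs, p)
```

Because the derivative is taken with respect to `u`, the third column already contains the factor d(sigma)/du = u/sqrt(u**2 + 1). Changing or removing the bound changes that column.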


bertinets · 2023-02-04T16:29:27Z

bertinets
Feb 4, 2023
Author

@newville
About the last point you raise.
I started working with sympy in combination with least_squares about 2 years ago, as our group often collects large 4D data sets (2D maps of 2D X-ray scattering profiles) from latest-generation synchrotron radiation sources, and we often have to fit > 10^5 integrated profiles per map. I realised that using the analytic derivatives is much faster than using numerical approximations, and that's fundamental for us to get good results in a reasonable time. Here is an example, inspired by the lmfit documentation.

import lmfit
import numpy as np
import sympy
from scipy.optimize import least_squares
from functools import partial

def fit_res(f,prm,x,y):
    funct=partial(f,*prm)(x=x)-y
    return funct

def jacmat_calc(jacb,pidx,pval,x,y=None,**kwargs):
    J = np.empty((len(x), len(pval)))
    for i,ja in enumerate(jacb):
        pv = [pval[p] for p in pidx[i]]
        J[:,i]=ja(x,*pv)
    return J

def step_fun(center,amplitude,sigma,slope,intercept,x):
    return amplitude * (1 - 1/(1 + np.exp((x - center)/sigma))) + slope*x + intercept

xv = np.linspace(0, 10, 45)
yv = np.ones_like(xv)
yv[:12] = 0.0
yv[12:15] = np.arange(15-12)/(15.0-12)
np.random.seed(0)
yv = 110.2 * (yv + 9e-2*np.random.randn(xv.size)) + 12.0 + 2.22*xv

p0 = {
      'center':2.5,
      'amplitude':100,
      'sigma':0.5,
      'slope':1,
      'intercept':10
      }

fit_kwargs = {}
fit_kwargs['jac']='3-point'
fit_kwargs['args']=(xv, yv)
fit_kwargs['method']='trf'
fit_kwargs['loss']='soft_l1'
fit_kwargs['max_nfev']=500*len(p0)
fit_kwargs['ftol']=1e-8
fit_kwargs['xtol']=1e-8
fit_kwargs['gtol']=1e-8

res_3p = least_squares(partial(fit_res,step_fun), [p0[k] for k in p0.keys()],  **fit_kwargs)

The optimisation problem is solved correctly and running %timeit least_squares(partial(fit_res,step_fun), [p0[k] for k in p0.keys()], **fit_kwargs) returns 644 ms ± 10.9 ms per loop.
The main reason for such a slow computation is that res_3p.nfev = 1455. Running the optimisation with a 2 point approximated jacobian takes about half a second and 1322 evaluations.

With an exact (analytical) jacobian things change dramatically:

# declaring the independent variable for sympy
x=sympy.Symbol('x')

lmfit_expr = {
                'Linear': 'slope*x + intercept',
                'Step': 'amplitude * (1 - 1/(1 + exp((x - center)/sigma)))',
    }

# the string in the next line is what one could build from lmfit_expr above, if it were available as an attribute of the 2 lmfit models
mod_fun_sp = sympy.sympify('amplitude * (1 - 1/(1 + exp((x - center)/sigma))) + slope*x + intercept')

# list of symbols used in the fitting equation   
ssymb = [s for s in list(mod_fun_sp.free_symbols) if str(s)!='x']

# calculate the symbolic Jacobian of the symbolic function f (i.e. calculate the list of df/dprms_i)
symjac = [mod_fun_sp.diff(p) for p in ssymb]

# convert the list of symbolic jacobian terms in a list of normal scalar functions 
jacb = []
jacslist=[]
for sj in symjac:
    jactmp = [s for s in sj.free_symbols if str(s)!='x']
    if len(jactmp)>0:
        jactmp.insert(0,sympy.Symbol('x'))
        jtmp = sympy.lambdify(jactmp,sj) 
        jacb.append(jtmp)
        jacslist.append(jactmp)
    else:
        x=sympy.Symbol('x')
        jtmp = sympy.lambdify(x,sj)
        jacb.append(jtmp)
        jacslist.append([x])

# create the list of arguments to be passed to the symbolic fitting function
fitargs=ssymb+[x]

# convert the symbolic fitting function into a normal python function
fitmodel=sympy.lambdify(fitargs,mod_fun_sp)
# create the guess for the lambdified function created by sympy. Note that the order of the parameter is different than the one for the python function
p0sp = [p0[str(s)] for s in ssymb]

# for each of the symbolic Jacobian equations, we require the index of the parameters list p0sp associated to the symbols
pidx=[]
p0k = [str(s) for s in ssymb]
for jsl in jacslist:
    pid = []
    for s in jsl:
        if str(s) != 'x':
            pid += [p0k.index(str(s))]
    pidx.append(pid)

# calculate the jacobian callable for least_squares
jac = partial(jacmat_calc,jacb,pidx)

# adjust the kwargs to use the analytical jacobian
fit_kwargs['jac']=jac
fit_kwargs['x_scale']='jac'

res_sp = least_squares(partial(fit_res,fitmodel), p0sp,  **fit_kwargs)

Now %timeit least_squares(partial(fit_res,fitmodel), p0sp, **fit_kwargs) returns 4.71 ms ± 51.4 µs per loop, and res_sp.nfev = 29.

In this case (of course this cannot be generalised to all of the cases) there is more than a 100 fold speed increase in running the optimisation with the exact jacobian. Note that this exact jacobian can also be used to calculate exactly (rather than estimate with a numerical evaluation) the covariance matrix for the estimation of the errors on the fitting parameters.
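The construction above can be condensed considerably if the parameter order is fixed by an explicit list instead of being taken from `free_symbols`, which is an unordered set. The following is a minimal sketch under that assumption. The bounds on `center` and `sigma` are an addition of this sketch (not part of the original script), included only to keep `exp` from overflowing during the search. The last lines show one common way to estimate the covariance matrix from the Jacobian at the solution.

```python
import numpy as np
import sympy
from scipy.optimize import least_squares

# Fixed parameter order, shared by the model, the Jacobian and the start values.
names = ['amplitude', 'center', 'sigma', 'slope', 'intercept']
x = sympy.Symbol('x')
symbols = [sympy.Symbol(n) for n in names]
expr = sympy.sympify(
    'amplitude * (1 - 1/(1 + exp((x - center)/sigma))) + slope*x + intercept')

model = sympy.lambdify([x, *symbols], expr, 'numpy')
grads = [sympy.lambdify([x, *symbols], expr.diff(s), 'numpy') for s in symbols]

def residual(pvals, xdata, ydata):
    return model(xdata, *pvals) - ydata

def jacobian(pvals, xdata, ydata):
    # broadcast_to handles derivatives that do not depend on x (e.g. d/d intercept = 1)
    cols = [np.broadcast_to(g(xdata, *pvals), xdata.shape) for g in grads]
    return np.column_stack(cols)

# Same synthetic data recipe as in the script above.
xv = np.linspace(0, 10, 45)
yv = np.ones_like(xv)
yv[:12] = 0.0
yv[12:15] = np.arange(15 - 12) / (15.0 - 12)
np.random.seed(0)
yv = 110.2 * (yv + 9e-2 * np.random.randn(xv.size)) + 12.0 + 2.22 * xv

p0 = [100.0, 2.5, 0.5, 1.0, 10.0]
# Assumed bounds (not in the original script) to keep exp((x - center)/sigma) finite.
lower = [-np.inf, 0.0, 0.05, -np.inf, -np.inf]
upper = [np.inf, 10.0, 5.0, np.inf, np.inf]
res = least_squares(residual, p0, jac=jacobian, bounds=(lower, upper), args=(xv, yv))

# Covariance estimate from the analytic Jacobian at the solution:
# cov = s^2 * (J^T J)^-1, with s^2 the residual variance per degree of freedom.
dof = xv.size - len(p0)
s2 = np.sum(res.fun**2) / dof
cov = s2 * np.linalg.pinv(res.jac.T @ res.jac)
```

This uses the default linear loss. With `loss='soft_l1'` the residuals and Jacobian are rescaled internally by scipy, so the covariance estimate above would need to be reconsidered.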

Going back to the other 2 points:

for the Voigt function, here is how I have already implemented it:
sympy.sympify('Real(exp(-((x-center +i*alpha)/(sigma*sqrt(2)))^2))*erfc(-i*((x-center +i*alpha)/(sigma*sqrt(2))))')
Piecewise should be used any time there is an if-then condition in the function definition.
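For the Voigt case, a minimal sketch of one way to check such an expression numerically is to compare it with `scipy.special.wofz`, using the identity w(z) = exp(-z^2) * erfc(-i*z). It assumes the lmfit-style parametrization with a separate `gamma` parameter, keeps the expression complex in sympy, and takes the real part in NumPy after evaluation. The names `voigt_c`, `voigt_fn` and `dsigma_fn` are illustrative.

```python
import numpy as np
import sympy
from scipy.special import wofz

x, amplitude, center, sigma, gamma = sympy.symbols('x amplitude center sigma gamma')

# Complex argument and complex-valued profile; the Voigt profile is the real part.
z = (x - center + sympy.I * gamma) / (sigma * sympy.sqrt(2))
voigt_c = amplitude * sympy.exp(-z**2) * sympy.erfc(-sympy.I * z) \
    / (sigma * sympy.sqrt(2 * sympy.pi))

args = [x, amplitude, center, sigma, gamma]
# 'scipy' is listed first so that erfc is evaluated by scipy.special.erfc,
# which accepts complex arguments.
voigt_fn = sympy.lambdify(args, voigt_c, modules=['scipy', 'numpy'])
dsigma_fn = sympy.lambdify(args, voigt_c.diff(sigma), modules=['scipy', 'numpy'])

xs = np.linspace(-3.0, 3.0, 13)
p = (1.5, 0.2, 0.8, 0.5)   # amplitude, center, sigma, gamma

# For real parameters, the derivative of the real part is the real part of the derivative.
sym_vals = np.real(voigt_fn(xs, *p))
dsigma_vals = np.real(dsigma_fn(xs, *p))

# Reference value from the Faddeeva function, as in the definition quoted earlier.
z_num = (xs - p[1] + 1j * p[3]) / (p[2] * np.sqrt(2))
ref_vals = p[0] * wofz(z_num).real / (p[2] * np.sqrt(2 * np.pi))
```

Far from the line centre the product of `exp(-z**2)` and `erfc` can overflow or lose precision, which `wofz` is designed to avoid, so this form is best treated as a check over a moderate range rather than a drop-in replacement.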

Anyway, I actually raised the initial point mostly because I think it would be useful for a user to have access to the mathematical definition used to build the model function with a simple call to a method or an attribute of the class. I am happy to provide the string expressions if you think it would be a good idea to have this implemented (I already have them for 80% of the lmfit models). Of course, if the string were parsable by sympy it would be a plus for me (not sure how many other users would use the string the way I would).

1 reply

bertinets Feb 4, 2023
Author

Just to add on the previous post.
I just found out that I cannot reproduce the results I obtain with least_squares with the equivalent implementation of lmfit, so I'm probably doing something wrong.
After running a fit with scipy, as in the previous post, here is the fit I get:

Now, if I try using least_squares within lmfit, as follows (using the definitions and imports from the previous post):

def fit_res_lmfit(f,prm,x,y):
    funct=partial(f,**prm)(x=x)-y
    return funct

# definition of the lmfit model
step_mod = lmfit.models.StepModel(form='logistic', prefix='step_')
line_mod = lmfit.models.LinearModel(prefix='line_')
mod = step_mod + line_mod
# parameters
modp = mod.make_params()

# initialise the values of the parameters with the values from the previous post
modp['step_center'].value = p0['center']
modp['step_amplitude'].value = p0['amplitude']
modp['step_sigma'].value = p0['sigma']
modp['line_intercept'].value = p0['intercept']
modp['line_slope'].value = p0['slope']

# need to remove the keyword 'args' from the fit_kwargs dict
fit_kwargs.pop('args')
fit_kwargs['jac']='3-point'
fit_kwargs['x_scale']=1.

res_lmfit_3p = lmfit.Minimizer(partial(fit_res_lmfit,mod.eval), modp, scale_covar=True, fcn_args=(xv,yv),**fit_kwargs).minimize(method='least_squares')

A %timeit says that the fit runs in 12 ms. However, plotting the results with plt.plot(xv,yv,'.',xv,mod.eval(res_lmfit_3p.params,x=xv)) gives:

Similarly, if trying to use an analytic jacobian, as follows:

#definition of the symbolic functions using lmfit expressions as from the previous post
st_sp = sympy.sympify(lmfit_expr['Step'])
ln_sp = sympy.sympify(lmfit_expr['Linear'])
mod_fun_sp_lmfit = st_sp + ln_sp

# list of symbols for sympy
ssymb = [s for s in list(mod_fun_sp_lmfit.free_symbols) if str(s)!='x']

# give names to the symbols consistent to the lmfit parameters name
for i,s in enumerate(ssymb):
    for p in modp:
        if str(s) in p:
            mod_fun_sp_lmfit =  mod_fun_sp_lmfit.subs(s,sympy.Symbol(p))
            ssymb[i] = sympy.Symbol(p)

# calculate the symbolic Jacobian of the symbolic function f (i.e. calculate the list of df/dprms_i)
symjac = [mod_fun_sp_lmfit.diff(p) for p in ssymb]

# convert the list of symbolic jacobian terms in a list of normal scalar functions 
jacb = []
jacslist=[]
for sj in symjac:
    jactmp = [s for s in sj.free_symbols if str(s)!='x']
    if len(jactmp)>0:
        jactmp.insert(0,sympy.Symbol('x'))
        jtmp = sympy.lambdify(jactmp,sj) 
        jacb.append(jtmp)
        jacslist.append(jactmp)
    else:
        x=sympy.Symbol('x')
        jtmp = sympy.lambdify(x,sj)
        jacb.append(jtmp)
        jacslist.append([x])

# create the list of arguments to be passed to the symbolic fitting function
fitargs_lmfit=ssymb+[x]

# convert the symbolic fitting function into a normal python function
fitmodel_lmfit=sympy.lambdify(fitargs_lmfit,mod_fun_sp_lmfit)

# as before we need to determine the name of the parameters (first iteration) and their order (from the second iteration) for the arguments of the jacobian matrix
jacmp=[]
pord = []
for jsl in jacslist:
    pid = []
    pn = []
    for s in jsl:
        if str(s) != 'x':
            for i,p in enumerate(modp):
                if str(s) in p:
                    pid+=[p]
                    pn+=[i]
    jacmp.append(pid)
    pord.append(pn)

def jacmat_calc_lmfit(jacb,pord,jacmp,x,modp,y=None,apply_bounds_transformation=True,**kwargs):
    J = np.empty((len(x), len(jacb)))    
    for i,(ja,jp,po) in enumerate(zip(jacb,jacmp,pord)):
        # at the first iteration the 'modp' argument is a lmfit Parameters object
        if isinstance(modp, lmfit.parameter.Parameters):
            pv = [modp[name].value for name in jp]
        # from the second one it is a list of floats
        else:
            pv = [modp[k] for k in po]
        J[:,i]=ja(x,*pv)
    return J

jac_lmfit = partial(jacmat_calc_lmfit,jacb, pord, jacmp, xv)
fit_kwargs['jac']=jac_lmfit
fit_kwargs['x_scale']='jac'
res_lmfit_jac = lmfit.Minimizer(partial(fit_res_lmfit,fitmodel_lmfit), modp, scale_covar=True, fcn_args=(xv,yv),**fit_kwargs).minimize(method='least_squares')

This runs in 3.4 ms, and the resulting fit looks more similar to the scipy version of least_squares, although not identical:

newville · 2023-02-05T04:55:21Z

newville
Feb 5, 2023
Maintainer

@bertinets I am somewhat sympathetic to the idea of trying to add analytic derivatives. Of course, it can help improve performance. Improving raw performance is good, but it is not the only consideration. And, for sure, adding analytic derivatives to lmfit is not going to be easy - they would need to manage parameter bounds and fixed parameters. I am also definitely sympathetic to people trying to fit large amounts of X-ray data, say from synchrotrons.

But: if the claim is that something is taking 1000 function evaluations, you should back that up. It is completely believable to me that your rather convoluted code is not doing exactly what you think it is. A simplified use of your code would be:

import time

import numpy as np
from lmfit.models import StepModel, LinearModel

# synthetic step + line data with Gaussian noise
xv = np.linspace(0, 10, 45)
yv = np.ones_like(xv)
yv[:12] = 0.0
yv[12:15] = np.arange(15-12)/(15.0-12)
np.random.seed(0)
yv = 110.2 * (yv + 9e-2*np.random.randn(xv.size)) + 12.0 + 2.22*xv

model  = StepModel(form='logistic', prefix='step_') + LinearModel(prefix='line_')
params = model.make_params(step_center=2.5, step_amplitude=100, step_sigma=0.5,
                           line_slope=1, line_intercept=20)
params['step_sigma'].min = 0

# time only the fit itself
t0 = time.time()
result = model.fit(yv, params, x=xv)
print(result.fit_report())
print('fit took %.4f sec' % (time.time()-t0))

which gives a good fit, and prints a report of

[[Model]]
    (Model(step, prefix='step_', form='logistic') + Model(linear, prefix='line_'))
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 79
    # data points      = 45
    # variables        = 5
    chi-square         = 5089.91899
    reduced chi-square = 127.247975
    Akaike info crit   = 222.775962
    Bayesian info crit = 231.809274
    R-squared          = 0.95963179
[[Variables]]
    step_amplitude:  108.615787 +/- 7.18818788 (6.62%) (init = 100)
    step_center:     3.07401261 +/- 0.04438598 (1.44%) (init = 2.5)
    step_sigma:      0.11807927 +/- 0.03887181 (32.92%) (init = 0.5)
    line_slope:      1.18451859 +/- 1.07200584 (90.50%) (init = 1)
    line_intercept:  20.4834525 +/- 3.47815791 (16.98%) (init = 20)
[[Correlations]] (unreported correlations are < 0.100)
    C(step_amplitude, line_slope)     = -0.841
    C(step_amplitude, step_sigma)     = 0.402
    C(line_slope, line_intercept)     = -0.364
    C(step_sigma, line_slope)         = -0.322
    C(step_center, line_intercept)    = 0.277
    C(step_center, line_slope)        = -0.147
    C(step_amplitude, line_intercept) = -0.118
fit took 0.0066 sec

A good fit is found with 79 function evaluations, in under 10 msec. With a script that could be read and understood by any undergraduate. I do not know how you managed to get 1000 function evaluations.

It's OK to say that things could be optimized. It is important to be clear-eyed about the facts, and decide what is to be optimized.
With a mere week of work, I'm sure you could save milliseconds of CPU time ;). You could rewrite the whole thing in Fortran (but probably end up tossing the bounds and the constraints ;) ). I would not recommend it (been there, done that, with constraints).

If you need fits to go 10x faster, consider using 10 cores to do 10 fits at once, right?

3 replies

bertinets Feb 5, 2023
Author

@newville
Yes, I have also noticed that using model.fit the fitting goes much faster. I thought it's because, as your output reports, lmfit uses the leastsq method as default. At least when I started developing my code, leastsq didn't handle bounds. Therefore I developed my code around least_squares.
I apologise for the rudimentary code. I am definitely not a developer, and I never studied any programming language, so my code definitely lacks simplicity. I'm sure I wouldn't be able to code something in Fortran.
In this case, however, the convoluted code comes in part from the way the Jacobian must be built to be called by least_squares, and in part from the fact that I had to adapt my original code to lmfit + least_squares (as the order of the arguments and their structure is different from plain least_squares) using the Minimizer class. This is a bit more cumbersome than using the .fit methods, but I couldn't find a way to have an lmfit model work with the least_squares method while passing the Jacobian as an argument. Maybe you can point me in the right direction on how to do this properly?

If I try to run your last line to make it a bit more similar to my call above (using a soft_l1 loss, which often helps fitting for example saxs data) as follows:

t0 = time.time()
result = model.fit(yv, params, x=xv, method='least_squares', fit_kws={'loss':'soft_l1','jac':'3-point'})
print(result.fit_report())
print('fit took %.4f sec' % (time.time()-t0))

I get the following output, at least on my PC.

[[Model]]
    (Model(step, prefix='step_', form='logistic') + Model(linear, prefix='line_'))
[[Fit Statistics]]
    # fitting method   = least_squares
    # function evals   = 90
    # data points      = 45
    # variables        = 5
    chi-square         = 5157.54304
    reduced chi-square = 128.938576
    Akaike info crit   = 223.369890
    Bayesian info crit = 232.403202
    R-squared          = 0.95909546
[[Variables]]
    step_amplitude:  113.071275 +/- 24.3951998 (21.58%) (init = 100)
    step_center:     3.07114365 +/- 0.26014819 (8.47%) (init = 2.5)
    step_sigma:      0.13194287 +/- 0.27996593 (212.19%) (init = 0.5)
    line_slope:      0.55809474 +/- 3.30318362 (591.87%) (init = 1)
    line_intercept:  20.0503789 +/- 14.5597729 (72.62%) (init = 20)
[[Correlations]] (unreported correlations are < 0.100)
    C(step_amplitude, line_slope)     = -0.797
    C(step_center, step_sigma)        = 0.728
    C(line_slope, line_intercept)     = -0.414
    C(step_amplitude, step_sigma)     = 0.193
    C(step_amplitude, line_intercept) = -0.166
    C(step_center, line_slope)        = -0.158
    C(step_center, line_intercept)    = 0.150
    C(step_sigma, line_slope)         = -0.129
fit took 0.0642 sec

which is quite a bit more than the 7 ms of leastsq but still much less than the 640 ms I get by calling least_squares directly... I guess I should change the way I call the function defining the residuals.

"If you need fits to go 10x faster, consider using 10 cores to do 10 fits at once, right?"

Very right!! We normally use at least 16 cores, although the speed doesn't seem to scale linearly with the number of cores.

Just want to emphasise that I wanted to optimise least_squares for our use cases and not much else. I found lmfit a few months ago and I saw that it does what most of my code does in a much more elegant and complete way, so I am now using lmfit for part of our fitting tasks, and I thought that having a definition method or attribute would be something other users may like. I didn't intend to suggest that lmfit should use analytical derivatives (although of course I would be more than enthusiastic about it).

Anyway, thanks a lot for the discussion.

bertinets Feb 5, 2023
Author

"I couldn't find a way to have a lmfit model to work with the least_squares method and passing the jacobian as an argument"

Nevermind, I've found a way now.

newville Feb 5, 2023
Maintainer

Yeah, in general, using least_squares or any scalar loss function other than "least squares" will add computational expense compared to leastsq. Ironically, and horribly, and through what can only be called intentionally bad and confusing design, scipy.optimize.least_squares gives the option to use loss functions other than "least squares" and a host of interacting options. I despise this method and would happily drop it from lmfit. It will never be the default, it will never be recommended. Unfortunately, it is the only implementation of "trf" in scipy.
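For readers unfamiliar with the loss option, here is a minimal sketch of the scalar cost that the scipy documentation gives for `least_squares`, namely 0.5 * sum(rho(f_i**2)), with rho(z) = z for `'linear'` and rho(z) = 2*(sqrt(1 + z) - 1) for `'soft_l1'` (at the default `f_scale=1`). The helper name `cost` and the sample residuals are illustrative only.

```python
import numpy as np

def cost(residuals, loss='linear'):
    """Scalar cost 0.5 * sum(rho(f_i**2)) for two of the documented loss options."""
    z = np.asarray(residuals, dtype=float)**2
    if loss == 'linear':
        rho = z                                # ordinary least squares
    elif loss == 'soft_l1':
        rho = 2.0 * (np.sqrt(1.0 + z) - 1.0)   # grows linearly in |f| for large residuals
    else:
        raise ValueError('loss not covered by this sketch: %r' % loss)
    return 0.5 * np.sum(rho)

# Three small residuals and one outlier (made-up numbers).
residuals = np.array([0.1, -0.2, 0.05, 8.0])
linear_cost = cost(residuals)
soft_cost = cost(residuals, 'soft_l1')
```

With `'soft_l1'` the outlier contributes far less to the cost than under plain least squares, which changes both the minimum found and the uncertainties reported for it.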

Saying that you need an analytic Jacobian definitely means that you understand the mathematics and the programming implementation of the objective function very well. Like, if you change the order of the variables, the Jacobian changes. If you fix or add bounds to any parameter, the Jacobian will change. If you use a constraint, the Jacobian changes. That's not really something for novices.
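The dependence of the Jacobian on variable order and on fixed parameters can be illustrated with a minimal sketch. It uses a simple exponential-plus-offset model with hand-written partial derivatives. The names `PARTIALS` and `jacobian` are hypothetical helpers, not part of lmfit or scipy.

```python
import numpy as np

# Analytic partial derivatives of amplitude*exp(-x/decay) + offset, keyed by parameter name.
PARTIALS = {
    'amplitude': lambda x, p: np.exp(-x / p['decay']),
    'decay': lambda x, p: p['amplitude'] * x * np.exp(-x / p['decay']) / p['decay']**2,
    'offset': lambda x, p: np.ones_like(x),
}

def jacobian(x, params, vary):
    """Build the Jacobian with one column per varying parameter, in the order of `vary`.

    Fixed parameters are simply left out of `vary` and contribute no column.
    """
    return np.column_stack([PARTIALS[name](x, params) for name in vary])

xs = np.linspace(0.0, 4.0, 5)
params = {'amplitude': 3.0, 'decay': 1.5, 'offset': 0.2}

J_all = jacobian(xs, params, ['amplitude', 'decay', 'offset'])   # all three vary
J_fixed = jacobian(xs, params, ['offset', 'amplitude'])          # decay fixed, order changed
```

The optimizer only sees a plain array, so the column order has to match the order of the variable vector it is working with. Bounds and constraints add further chain-rule factors on top of this.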
