
[FEATURE] Options for vector-valued functions #363

Open
baggepinnen opened this issue Aug 20, 2024 · 7 comments
Labels
enhancement New feature or request

Comments

@baggepinnen

baggepinnen commented Aug 20, 2024

Hello there 👋 I'm writing regarding the following admonition in the docs :)

[screenshot: docs admonition about the limitation on vector-valued nonlinear functions]

I am solving optimal-control problems where the dynamics are encoded in the form of a ModelingToolkit model. ModelingToolkit can generate very efficient code for the right-hand side of the dynamics, i.e., I can obtain an executable function $f$ in

$$\dot x = f(x, u)$$

where $x$ and $u$ are potentially high-dimensional (`length(x) == 10` in the particular example I'm working on right now).

Tracing through $f$ with JuMP variables (infinite variables) "works", with plenty of hacks, but the result is very slow: the traced expression for $\dot x$ is too large to even print to the REPL without crashing Julia, while evaluating the code generated by MTK takes only 1µs. The issue is that the code emitted by MTK contains tons of common subexpressions, inlined solutions of linear systems, etc. ($f$ is the dynamics of a high-dimensional multibody system). This is all handled very efficiently in $f$, where memory for solving the linear systems is manually allocated and freed, making $f$ free of GC allocations.

To avoid explosive expression growth when tracing through $f$ with JuMP variables, I replace the linear-system solves `x = A\b` that appear in $f$ with the following

```julia
x = InfiniteOpt.@variables(m, begin
    [1:n], Infinite(τ)  # temporary anonymous variables
end)[1]
lhs = A * x
@constraint(m, lhs .== b)  # enforce A*x == b instead of computing A\b
bout .= x
```

i.e., I introduce temporary variables `x` and equality constraints `A*x == b`. As I said, this works, but the memory requirements are sky high, and I believe it should be possible to improve performance by at least 1000x over what this gives me.

Coming back to the admonition, I have been thinking about different ways to work around the limitation on vector-valued nonlinear functions, but haven't found any alternative that works for me. What approach did you have in mind with

> However, this limitation can readily be removed if there is a use case for it

?

I am willing to jump through quite a few hoops to make this efficient if required :) Thanks for your time!

@baggepinnen baggepinnen added the enhancement New feature or request label Aug 20, 2024
@pulsipher
Collaborator

pulsipher commented Aug 21, 2024

Hi there,

The quoted admonition is removed in the development version of the package, which will be released relatively soon. The current version uses its own nonlinear expression system, but the new version adopts the nonlinear modeling interface provided by JuMP, based on `GenericNonlinearExpr`. JuMP does not currently support vector-valued user-defined functions directly, but there are workarounds (see https://jump.dev/JuMP.jl/stable/tutorials/nonlinear/tips_and_tricks/#User-defined-operators-with-vector-outputs).
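For reference, the pattern from that tutorial can be sketched as follows. This is a simplified sketch: it omits the separate cache for dual-number inputs that the full tutorial version keeps, and the example function `f` is purely illustrative. The idea is to wrap a vector-output function so each output becomes its own scalar-output function, while the expensive evaluation runs only once per distinct input point.

```julia
# Simplified memoization sketch for vector-output user-defined functions,
# following the JuMP "vector outputs" tips-and-tricks tutorial.
function memoize(foo::Function, n_outputs::Int)
    last_x, last_f = nothing, nothing
    function foo_i(i, x::T...) where {T<:Real}
        if x !== last_x          # new input point: recompute and cache
            last_x, last_f = x, foo(x...)
        end
        return last_f[i]::T      # return the i-th cached output
    end
    return [(x...) -> foo_i(i, x...) for i in 1:n_outputs]
end

# Hypothetical 2-output function of splatted scalar arguments:
f(a, b) = [a^2 + b, a * b]
f1, f2 = memoize(f, 2)
```

Each `fi` can then be registered as an ordinary scalar-output operator; evaluating `f2` right after `f1` at the same point hits the cache rather than re-running `f`.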

In the longer term, adding array-valued nonlinear expression support is on the JuMP development roadmap (see jump-dev/MathOptInterface.jl#2402 and https://jump.dev/JuMP.jl/stable/developers/roadmap/#Development-roadmap). Once this is supported by JuMP, InfiniteOpt will automatically inherit the ability as well.

For your model, I strongly suspect it will be much more performant to define the ODE model directly in InfiniteOpt (especially with the experimental InfiniteExaModels backend) and not use MTK. However, you can also try the aforementioned workaround to embed the MTK function as a user-defined function; in that case, I suggest trying it with the development version of InfiniteOpt (i.e., `Pkg.add(url = "https://github.com/infiniteopt/InfiniteOpt.jl", rev = "master")`).

@baggepinnen
Author

> but there are workarounds (see https://jump.dev/JuMP.jl/stable/tutorials/nonlinear/tips_and_tricks/#User-defined-operators-with-vector-outputs).

I have tried this workaround without much success. Also, when I tried, the `InfiniteOpt.@register` macro crashed Julia with the number of function arguments required for this system.

Ditching MTK is not an option for me in this case, for two reasons:

  • I'm working on coming up with general strategies to solve OC problems with MTK models for our commercial customers that use MTK. For non-multibody MTK models, the approach of tracing through the generated dynamics with InfiniteOpt/JuMP variables has worked fine, but for multibody dynamics the "everything is scalar" approach of JuMP does not scale well enough.
  • I'm developing and using a multibody component library for MTK to model this particular system. Deriving the dynamics of a 3D multibody system with 10-100s of components (including other domains such as electrical and thermal) and kinematic loops manually is extraordinarily tedious and error-prone without such a library.

I have spent the day implementing a primitive trajectory-optimization package that performs the direct-collocation transcription, and I have obtained reasonable performance, except for computing the Hessian of the Lagrangian, which is exactly where JuMP is struggling as well. Sparse forward-over-reverse-mode AD does best, but not quite well enough. I have tried MadDiff.jl and FastDifferentiation.jl before, and they are blazingly fast for the operations they support, but I've found their support too limited to be workable for generic MTK models.

I will give the InfiniteExaModels package a quick spin though to see if it improves upon the JuMP AD in this case :)

@pulsipher
Collaborator

Thanks for the clarification on your use case. This certainly provides a compelling argument for the proposed array-valued AD system in JuMP.

Coming up with an efficient approach to compute the Hessian for these types of problems is certainly a challenge and one of the reasons that the JuMP milestone is taking some time. Out of curiosity, did the memoize approach fail using the master branch? If so, can you provide some more details on the problem that caused the failure? This would be helpful info to consider going forward with the planned JuMP developments.

Depending on the types of operations and the length of the traced expressions, InfiniteExaModels may or may not work well. It is really intended for models that define the ODEs/PDEs directly in InfiniteOpt. Perhaps writing a bridge from MTK directly to an InfiniteModel would be better than generating the evaluation function, but of course this would require some development. I have considered such a bridge in the past, but I don't know MTK that well, and I have limited development time these days.

@baggepinnen
Author

> Out of curiosity, did the memoize approach fail using the master branch?

No, it was with the latest release. I think the problem was related to the vast number of methods registered by the `@register` macro when the function had many input arguments. In particular, `type_combos = vec(collect(Iterators.product(ntuple(_->Ts, num_args)...)))` becomes extremely large if you try to register a function that takes long arrays. Since arrays were not supported, you had to splat the arrays and use functions with one scalar input argument per array element. I don't think the new JuMP nonlinear interface currently has anything that can combat this issue?
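To illustrate the blow-up (numbers are illustrative, assuming a registration scheme that creates one method per combination of argument types):

```julia
# Hypothetical illustration: if registration creates one method per
# combination of admissible argument types, the method count grows as
# length(Ts)^num_args. With e.g. 3 admissible scalar types and a
# splatted 20-element array, that is already billions of combinations.
Ts = (Float64, Int, Any)   # illustrative type set
num_args = 20              # one scalar argument per array element
n_combos = length(Ts)^num_args
println(n_combos)          # 3^20 = 3486784401
```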

@pulsipher
Collaborator

> > Out of curiosity, did the memoize approach fail using the master branch?
>
> No, it was with the latest release. I think the problem was related to the vast number of methods registered by the `@register` macro when the function had many input arguments. In particular, `type_combos = vec(collect(Iterators.product(ntuple(_->Ts, num_args)...)))` becomes extremely large if you try to register a function that takes long arrays. Since arrays were not supported, you had to splat the arrays and use functions with one scalar input argument per array element. I don't think the new JuMP nonlinear interface currently has anything that can combat this issue?

Actually, it does. The new `@operator` macro creates an operator object that doesn't require or create a bunch of methods like `@register` did. For this use case, it should be much faster. If you have the time, I suggest giving it a try to see if it works.

@aml5600

aml5600 commented Sep 20, 2024

Hello! I am in a very similar boat to @baggepinnen: I have an MTK model sprinkled with some vector-in, vector-out functions that would probably work best behind an `@operator`-esque wall.

@baggepinnen, did you have any luck using `@operator`? I will experiment with this approach, but I'm curious whether you ran into any pitfalls or roadblocks.

Thanks to you both!

@baggepinnen
Author

I did not manage to make use of `@operator`; the nonlinear expressions in JuMP were much harder to work with than the NLExpr in InfiniteOpt. I ended up writing the entire collocation transcription manually and interfacing Ipopt directly through MathOptInterface, bypassing JuMP. This gave me more than a 10x speedup on my particular problem.
