Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: initial implementation of new DiscreteSystem #2507

Merged
merged 13 commits into from
Mar 22, 2024

Conversation

AayushSabharwal
Copy link
Member

@AayushSabharwal AayushSabharwal commented Feb 28, 2024

Checklist

  • Appropriate tests were added
  • Any code changes were done in a way that does not break public API
  • All documentation related to code changes were updated
  • The new code follows the
    contributor guidelines, in particular the SciML Style Guide and
    COLPRAC.
  • Any new documentation only uses public API

Additional context

Add any other context about the problem here.

using ModelingToolkit
@parameters σ=28.0 ρ=10.0 β=8/3 δt=0.1
@variables t x(t)=1.0 y(t)=0.0 z(t)=0.0
D = Difference(t; dt=δt)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs update?

"""Array variables."""
var_to_name::Any
"""Control parameters (some subset of `ps`)."""
ctrls::Vector
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should go away?

"""
parent::Any

function DiscreteSystem(tag, discreteEqs, iv, dvs, ps, tspan, var_to_name, ctrls,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ctrls

Constructs a DiscreteSystem.
"""
function DiscreteSystem(eqs::AbstractVector{<:Equation}, iv, dvs, ps;
controls = Num[],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
controls = Num[],

@baggepinnen
Copy link
Contributor

The old discrete systems had a lot of correctness issues. Some of the issues reporting those have been closed since the old stuff was removed, but it would probably still be good to add the reported MWEs to the tests. Looking through the old issues, many were related to algebraic variables, such as

Perhaps adding an error if not all equations are difference equations would be sufficient for now?

@AayushSabharwal AayushSabharwal force-pushed the as/discrete-system branch 3 times, most recently from 88b73fe to a41639d Compare March 4, 2024 10:01
@AayushSabharwal AayushSabharwal marked this pull request as ready for review March 4, 2024 10:39
Comment on lines 45 to 51
# if sys isa DiscreteSystem
# eqs = difference_order_lowering(sys)
# @set! sys.unknowns = [eq.lhs for eq in eqs]
# @set! sys.eqs = eqs
# @show parameters(sys)
# return sys
# end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's this commented for?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should have been deleted

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in e596393

Comment on lines 34 to 36
Currently, all updates happen at integer intervals (`t = 1, 2, 3...`). All shifts
in the equations must be positive. For example, instead of specifying the update equation
for the Fibonacci series as `x ~ x(k-1) + x(k-2)` use `x(k+2) ~ x(k+1) + x`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not touch that right now 😅 . Just remove that from the docs

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still correct?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@@ -0,0 +1,36 @@
# Modeling Discrete Systems
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mark as experimental until the indexing thing is solidified.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@@ -28,7 +28,7 @@ struct Shift <: Operator
Shift(t, steps = 1) = new(value(t), steps)
end
Shift(steps::Int) = new(nothing, steps)
normalize_to_differential(s::Shift) = Differential(s.t)^s.steps
normalize_to_differential(s::Shift) = Differential(s.t)^abs(s.steps)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not correct. It changes u(k)~u(k-2)-u(k-1) to u(k)~u(k+2)-u(k+1) which obviously isn't the same.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bump

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Copy link
Member

@YingboMa YingboMa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's support non-positive shift properly before merging.

@AayushSabharwal
Copy link
Member Author

d9891cb fixes a bug introduced by #2537


u0 = [S => 990.0, I => 10.0, R => 0.0]
p = [β => 0.05, c => 10.0, γ => 0.25, δt => 0.1]
tspan = (0.0, 100.0) # value function (from Symbolics) is used to convert a Num to Float64
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does this comment mean?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It means I copied the example from tests and didn't notice the comment 😅

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@AayushSabharwal AayushSabharwal force-pushed the as/discrete-system branch 2 times, most recently from c69e12b to d668b8f Compare March 13, 2024 11:45
@AayushSabharwal
Copy link
Member Author

Codecov is very unhappy for some reason?

@ChrisRackauckas
Copy link
Member

@AayushSabharwal AayushSabharwal force-pushed the as/discrete-system branch 2 times, most recently from ea84b09 to a3b85cb Compare March 14, 2024 05:45
@AayushSabharwal
Copy link
Member Author

Docs don't want to build for some reason.

I also found a minor bug, but it is caused by the changes in this PR so merging this will have to wait a little bit

[kp => 1.0; z => 3.0; z(k + 1) => 2.0])
@test sort(vcat(prob.p...)) == [0, 1.0, 2.0, 3.0, 4.0] # yd, kp, z(k+1), z(k), ud
[kp => 1.0; ud(k - 1) => 2.0; ud(k - 2) => 2.0])
@test sort(vcat(prob.p...)) == [0, 1.0, 2.0, 2.0, 2.0] # yd, Hold(ud), kp, ud(k - 1)
sol = solve(prob, Tsit5(), kwargshandle = KeywordArgSilent)
Copy link
Contributor

@baggepinnen baggepinnen Mar 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this test here, including changing the initial condition for x = 0.1

prob = ODEProblem(ss, [x => 0.1, y => 0.0], (0.0, Tf),
    [kp => 1.0; ud(k - 1) => 2.0; ud(k - 2) => 2.0])

#      kp  yd(0)  ud(k-1)
ud_0 = 1 * 0.1    + 2; # This computes what ud should be at time t = 0
@test sol(0+eps(), Val{1})[]  -sol(0+eps(), idxs=x) + ud_0

it fails. I check that
D(x)(0) is correctly calculated (I got errors trying to access sol(0, idxs=D(x)) by adding -x(0) and ud(0)

Copy link
Member Author

@AayushSabharwal AayushSabharwal Mar 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this discrepancy is because the init function returned by clock inference codegen needs to also perform a discrete update, since the unknowns are now ud(k-1) and ud(k-2) (past values). Currently, it interprets ud(k-1) as the value of ud at t=0. This sort of discrepancy comes about because the unknowns are named ud(k-1) and ud(k-2) which implies that they are past values. However, after the affect! function is run, the value of ud(k-1) is the value of ud at the current time. Problem initialization assumes that the initial values are the ones after affect! has been run, so to speak.

In other words, at time t, when the callback is scheduled to run, ud(k-1) and ud(k-2) are the two past values of ud. After the callback is executed (and we are still at time t), ud(k-1) now stores the value of ud at time t, and ud(k-2) is the previous value. Problem initialization expects that the initial values given are those after the callback has been run at t=0

Copy link
Member Author

@AayushSabharwal AayushSabharwal Mar 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this interpretation, the value of 1.9 returned by sol(0+eps(), Val{1}) in this example is correct, since ud at the current time is considered to be 2.0 and x is 0.1

However, the initialization code should be updated considering the new nomenclature.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, it interprets ud(k-1) as the value of ud at t=0

Then we should change this. At each tick of the clock, x(k) refers to the new value at the tick time, so ud(k) refers to the value of ud after the clock has ticked at t = 0, while ud(k-1) refers to the first past value (before t=0).

The shift-index notation $x(k)$ is standard in the literature of discrete-time systems and is short for $x(t_0 + k \Delta T)$ where $\Delta T$ is the tick interval of the clock. A discrete-time difference equation describes something that holds true for all $k$, if we insert the start time $t_0 = 0$ and $k=0$ in the equation for ud, we get the value of ud at the initial time ud(0) = kp*yd(0) + ud(-1)

The reason I took the time to clone this PR and add this test is that these details are extremely important. The old discrete-time functionality that we threw away had a lot of small nuances like this left undocumented and untested, and as a result it was completely impossible to rely on it for any serious real-world use case. I want to avoid going back to the same situation again, it's worth spending the time to get this right :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sort of discrepancy comes about because the unknowns are named ud(k-1) and ud(k-2)

This is an implementation detail and should not influence how the system behaves from the perspective of the user. We can call the unknowns Aayush and Fredrik if it makes it less confusing :)

The choice of state ud(k-1), ud(k-2) is a slightly strange, we are solving for ud(k), so I would probably have chosen this variable to be part of the state rather than the two past values, c.f.
https://juliahub.slack.com/archives/C03UEC0P6G0/p1710482554351029?thread_ts=1710481464.183949&cid=C03UEC0P6G0

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed now

Copy link
Contributor

@baggepinnen baggepinnen Mar 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sweet :)

When we get to the stage where we can access discrete-time variables from the solution, I will be equally picky with exactly what index is obtained. sol(t0 + k*dt, idxs=ud) should get ud as computed for tick k, even if this variable happens to be called u(k-1) internally. When the solution object is plotted, the plot label should also refer to u(t0 + k*dt) as u(k).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that shouldn't be too difficult

Copy link
Contributor

@baggepinnen baggepinnen Mar 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's somewhat of an artifact of using nonpositive shifts. Consider the continuous-time case

Is this analogy doing us any service?

Similarly, for x(k) ~ f(x(k-1), x(k-2)) we take x(k-1) and x(k-2) as the unknowns, and compute x(k) which is now reinterpreted as "the next value of x(k-1)".

Just a comment on the terminology here. The fundamental unknown is the discrete-time function x(k) which is a function of the iteration index k. There is only one function x, and this function is defined on the domain k \in {..., -1, 0, 1, 2, ...}. In order to solve for this unknown function, we choose a state realization comprised out of the state variables x(k-1), x(k-2). Here is where our discussion might become a bit confusing; It's quite common to change the names of the state variables completely into something more like

X(k) = [X1(k) X2(k)] = [x(k-1), x(k-2)]

so that you can speak about the (singular) state at time index k being X(k), and this state is comprised of the state variables X(k) = [X1(k) X2(k)] =[x(k-1), x(k-2)]. In this discussion so far, we have conflated the unknown we are solving for with the state realization we have chosen in order to do so. The reason that these two are typically separated is partly to avoid this confusion.

MTK introduces new "unknowns" to represent the same functionx but indexed at different points in time, like x(k-1) and x(k-2), but I consider these unknowns computational tool rather than a property of the system being modeled. Equations are solved for the unknown variables in terms of the known variables, so the only unknown in x(k) ~ f(x(k-1), x(k-2)) is x(k), this is why "state" is a better description here. The state is what we need to store as memory in order to summarize the past to compute the future. This is not the necessarily the same as the unknowns in the equations we are solving.

The terms "state" and "unknowns" are not interchangeable, which was the fundamental reason we made the switch from "states" (the plural form was wrong on so many levels) to "unknowns". After structural simplify, the term "state" is more descriptive, since structural simplification can be seen as the selection of a state realizaition among all the unknowns.

I think that shouldn't be too difficult

Probably not, but will it forever haunt us in the sense that every person looking into this in the future will be confused by the presence of a one-step shift in many places?

Sorry for all the long text here, I just want to make sure we fully understand each other very precisely and end up with a nice and maintainable solution that will be easy to communicate precisely to users so that we avoid the long list of bugs and confusions that plagued the old implementation.

Copy link
Member Author

@AayushSabharwal AayushSabharwal Mar 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification on terminology. I've been saying "unknowns" everywhere purely because that's what they are to MTK. I now understand the semantic difference between unknowns and the state realization.

Is this analogy doing us any service?

Not really, but it's how structural_simplify works. The state realization is the lowest k-1 order states. Before this PR, those states included x(k) just because it was the lowest order. Now x(k) is the highest order state, so to speak, and isn't included. I could "fix" this by shifting all discrete-time variables by one timestep prior to any transformations (so x(k) ~ x(k-1) + x(k-2) becomes x(k+1) ~ x(k) + x(k-1)) if it is required. However, I don't fully trust the robustness of this "fix".

Probably not, but will it forever haunt us in the sense that every person looking into this in the future will be confused by the presence of a one-step shift in many places?

For indexing, this should only be a one-step shift in one place: SII.parameter_index. If we receive an unshifted symbol which isn't a parameter, but Shift(t, -1)(sym) is, then we return the index of the latter.

The plot recipes would also need some special handling, though, and we don't have a way to directly handle that yet. getname doesn't work for shifted variables, so we could just add a method where getname(x(k-1)) returns :x, and we add the (k) during plotting

@AayushSabharwal
Copy link
Member Author

The optimization errors seem to be due to bugs upstream in the Optimization stack. The generated functions look good to me, it's probably an error with how they're called

# Independent and dependent variables and parameters
@parameters c nsteps δt β γ
@constants h = 1
@variables S(t) I(t) R(t)
Copy link
Contributor

@baggepinnen baggepinnen Mar 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I modify this problem to add an observed variable

@variables S(t) I(t) R(t) R2(t)

# Equations
eqs = [S ~ S(k - 1) - infection * h,
    I ~ I(k - 1) + infection - recovery,
    R ~ R(k - 1) + recovery,
    R2 ~ R]

and then try to plot the observed variable it does not work

julia> plot(sol_map, idxs=[R, R2])
ERROR: ArgumentError: R2(t) is neither an observed nor an unknown variable.

Will this somehow be handled?

Copy link
Member Author

@AayushSabharwal AayushSabharwal Mar 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you check observed(sys) you'll notice that it'll be something like R2(k+1) ~ R + recovery which is an artifact of shifting the entire system forward one timestep. Plotting R2(k+1) should work. While it's not great user API, it's the only way I found of making this correct, since in general we only have the capability of calculating the next state given the current state, even for observed variables.

Semantically, everything for DiscreteSystem should be correct. The saved values correspond to the states (i.e. S(k) is chosen to be a state, and the saved timeseries is for S(k). Previously, we chose S(k-1) as the state and saved S(k)) and the observed equations are correct (albeit for the next timestep). Plotting R2(k+1) should plot a shifted R(k). I haven't tested this, which I'll do and add to the testcases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you open an issue about the user interface piece here and we merge?

Copy link
Contributor

@baggepinnen baggepinnen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

@@ -0,0 +1,51 @@
# (Experimental) Modeling Discrete Systems
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it still experimental?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kind of, unless observed variables are working as intended now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, since shifting the entire system by one step is basically a workaround and eventually we'll need proper compiler support for this. I've discussed with @baggepinnen and that will end up changing things such as the state realization and observed equations in the simplified system.

@ChrisRackauckas
Copy link
Member

Test failures

@AayushSabharwal
Copy link
Member Author

Fixed and rebased

@AayushSabharwal
Copy link
Member Author

CI failures are due to codecov

@ChrisRackauckas ChrisRackauckas merged commit 9ef2981 into SciML:master Mar 22, 2024
7 of 24 checks passed
@AayushSabharwal AayushSabharwal deleted the as/discrete-system branch March 22, 2024 14:56
@AayushSabharwal
Copy link
Member Author

Finally 🥳

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants