Coevolve aggregator for VariableRateJump #276

Merged
merged 75 commits into from
Jan 3, 2023

Contributor

@gzagatti gzagatti commented Dec 7, 2022

This PR introduces the Coevolve aggregator for VariableRateJump. This aggregator supports the SSAStepper() integrator for the efficient simulation of jump processes with time-varying rates.

Some background: most of the algorithms in JumpProcesses.jl are borrowed from the biochemistry literature. However, jump processes have also been studied extensively in the statistics literature, where they are known as point processes (see Daley and Vere-Jones [1]). Many of the simulation algorithms implemented in this library appear in the point process literature under other names. Among the most popular is Ogata's algorithm [2], which is very similar to Gillespie's direct method. Ogata's algorithm, as adapted in Daley and Vere-Jones [1], is a general-purpose algorithm for any point process that evolves through time. Farajtabar et al. [3] modify Ogata's algorithm to use a priority queue in order to simulate a compound Hawkes process. Here I adapt Farajtabar et al.'s COEVOLVE algorithm to simulate any evolutionary point process, hence the name of the algorithm.
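To illustrate the scheduling idea behind a queue-based method, here is a minimal sketch, not the implementation in this PR. It assumes rates that stay constant between jumps, so every candidate time is accepted and only the queue and dependency-graph bookkeeping remains. The function `simulate_queue` and its arguments are hypothetical names, and a plain vector with `findmin` stands in for a real priority queue.

```julia
using Random

# Hypothetical sketch, not the JumpProcesses.jl implementation.
# rates[i](u) returns the rate of process i given the state u,
# affects[i](u) applies the jump of process i to u, and dep_graph[i]
# lists the processes to reschedule after i fires (it must include i).
function simulate_queue(rng, rates, affects, dep_graph, u, tstop)
    t = 0.0
    # Next candidate time of each process; a vector plus findmin stands in
    # for a priority queue to keep the sketch dependency-free.
    next_times = [t + randexp(rng) / r(u) for r in rates]
    events = Tuple{Float64, Int}[]
    while true
        tnext, i = findmin(next_times)   # earliest candidate and its index
        tnext > tstop && break
        t = tnext
        affects[i](u)                    # fire process i
        push!(events, (t, i))
        # Only the processes that depend on i get a new candidate time.
        for j in dep_graph[i]
            next_times[j] = t + randexp(rng) / rates[j](u)
        end
    end
    return events
end
```

With time-varying rates, each candidate popped from the queue would additionally go through an accept/reject step against the rate bounds before firing.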

This algorithm requires user-defined bounds and a dependency graph. Therefore, I extend VariableRateJump to accept three additional keyword arguments: lrate, urate and L. lrate(u, p, t) is a function that computes a lower bound of the rate from t to t + L, urate(u, p, t) computes an upper bound, and L(u, p, t) computes the length of the interval over which the bounds are valid.
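As a rough illustration of how such bounds can be used, the sketch below samples the next jump time of a single process by thinning (rejection sampling). It is a simplified sketch under stated assumptions, not the code in this PR: the state u is held fixed, L must return a positive length, and `next_jump_time` is a hypothetical helper name.

```julia
using Random

# Hypothetical helper, not part of JumpProcesses.jl.
# rate, lrate, urate and L all take (u, p, t), as described above.
function next_jump_time(rng, rate, lrate, urate, L, u, p, t, tstop)
    while t < tstop
        window = L(u, p, t)      # bounds are valid on [t, t + window]
        ub = urate(u, p, t)      # upper bound of the rate on that interval
        lb = lrate(u, p, t)      # lower bound of the rate on that interval
        s = randexp(rng) / ub    # candidate waiting time at the bounding rate
        if s > window
            t += window          # bounds expired: advance and recompute them
            continue
        end
        t += s
        t > tstop && break
        v = rand(rng) * ub
        # Accept immediately if v falls under the lower bound, which avoids
        # evaluating the rate; otherwise compare against the actual rate.
        if v <= lb || v <= rate(u, p, t)
            return t
        end
    end
    return Inf                   # no jump before tstop
end
```

The lower bound only saves rate evaluations; setting it to zero leaves the sampled distribution unchanged.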

A toy example is given below:

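using JumpProcesses, Plots
# length of the interval over which the bounds below are valid
L(u,p,t) = (1/p[1])*2
# for fixed u the rate grows linearly in t, so it is increasing on [t, t + L]
rate(u,p,t) = t*p[1]*u[1]
# lower bound: the rate at the start of the interval
lrate(u, p, t) = rate(u, p, t)
# upper bound: the rate at the end of the interval
urate(u,p,t) = rate(u, p, t + L(u,p,t))
# each jump removes one unit from u[1]
affect!(integrator) = integrator.u[1] -= 1
vrj = VariableRateJump(rate, affect!; lrate=lrate, urate=urate, L=L)
u0 = [1_000.]
p = (1.0,)
tspan = (0., 10.)
prob = DiscreteProblem(u0, tspan, p)
# dep_graph=[[1]]: when jump 1 fires, the rate of jump 1 must be recomputed
jprob = JumpProblem(prob, Coevolve(), vrj; dep_graph=[[1]])
sol = solve(jprob, SSAStepper())
plot(sol, title="Trajectory", legend=false)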

[Figure: plot of the simulated trajectory]

I have run some benchmarks to evaluate the performance of the algorithm, using the jump benchmarks of SciMLBenchmarks. Coevolve performs on par with the alternatives, with a fixed penalty over NRM. Coevolve basically reduces to NRM when all the jumps are ConstantRateJump or MassActionJump, but I was not able to match NRM's performance.

| Benchmark | Direct | FRM | SortingDirect | NRM | DirectCR | RSSA | RSSACR | Coevolve |
|---|---|---|---|---|---|---|---|---|
| Diffusion CTRW | 5.1 s | | 1.2 s | 0.7 s | 0.4 s | 1.6 s | 0.4 s | 1 s |
| Multistate Model | 0.1 s | 0.2 s | 0.1 s | 0.2 s | 0.2 s | 0.1 s | 0.1 s | 0.4 s |
| Neg. Feed. Gene Expression | 0.2 ms | 0.3 ms | 0.2 ms | 0.4 ms | 0.4 ms | 0.4 ms | 0.8 ms | 0.4 ms |
| Marchetti Gene Expression | 0.4 s | 0.6 s | 0.4 s | 0.9 s | 0.9 s | 0.6 s | 0.9 s | 1.3 s |

More complex examples include the compound Hawkes process, which was the main motivation for contributing this PR.

[Figure: compound Hawkes process example]

I have also developed a new benchmark using the compound Hawkes process and found significant improvements compared to using the ODE solver. Please check my branch of SciMLBenchmarks for more details. Note that the Direct method does not run within the allowed time when the number of nodes reaches above 40.

| V | Direct (n) | Direct (median time) | Coevolve (n) | Coevolve (median time) |
|---|---|---|---|---|
| 1 | 50 | 124.5 μs | 50 | 3.4 μs |
| 10 | 50 | 28.6 ms | 50 | 455.5 μs |
| 20 | 50 | 223.8 ms | 50 | 5.2 ms |
| 30 | 50 | 715.1 ms | 50 | 7.9 ms |
| 40 | 31 | 715.1 ms | 50 | 30.2 ms |
| 50 | 15 | 4.1 s | 50 | 61.7 ms |
| 60 | 10 | 6.7 s | 50 | 137.1 ms |
| 70 | 6 | 10.9 s | 50 | 202.9 ms |
| 80 | 4 | 17.8 s | 50 | 346.4 ms |
| 90 | 3 | 27.24 s | 50 | 642.0 ms |

I have added unit tests and updated the documentation to reflect the changes introduced here.

As I completed this PR, I learnt about PR #252, which also extends VariableRateJump. The author implements a different algorithm that also uses rate bounds. While the algorithm implemented in that PR is analogous to rejection-based algorithms, the algorithm in this PR is analogous to next-reaction methods, so the two face different trade-offs.

I am looking forward to your feedback.

[1] D. J. Daley and D. Vere-Jones, An Introduction to the Theory of Point Processes: Volume I: Elementary Theory and Methods, 2nd ed. New York: Springer-Verlag, 2003. doi: 10.1007/b97277.

[2] Y. Ogata, “On Lewis’ simulation method for point processes,” IEEE Transactions on Information Theory, vol. 27, no. 1, pp. 23–31, Jan. 1981, doi: 10.1109/TIT.1981.1056305.

[3] M. Farajtabar, Y. Wang, M. Gomez-Rodriguez, S. Li, H. Zha, and L. Song, “COEVOLVE: a joint point process model for information diffusion and network evolution,” J. Mach. Learn. Res., vol. 18, no. 1, pp. 1305–1353, Jan. 2017, doi: 10.5555/3122009.3122050.

@isaacsas
Member

@gzagatti just to update you. I've gone through and updated a subset of the tutorials, along with updating some of the new doc strings. I also changed L to rateinterval as I think we need something more descriptive for this field. If you have a better name though I'm happy to change it again.

Still TODO on my end:

  1. Finish tweaking the docs I haven't yet gotten to.
  2. Figure out how we want to handle JumpProblem. It seems quite a bit more complicated now as rewritten, so it would be nice to try to simplify it. Also, we'd like to make sure we can handle mixes of the two VariableRateJump types now (what I call general vs. bounded VariableRateJumps in the tutorial/docs I've updated). Adding a test for this would be good too (i.e. a mix of bounded and unbounded VariableRateJumps, and hence requiring being over an ODEProblem, but still using Coevolve for the bounded jumps).
  3. (Future work): After this PR we should figure out why performance is below NRM for non-VariableRateJump systems. Given we are using FunctionWrappers there may also be a small gain to be had by keeping lrate=nothing when no lrate is specified, and skipping the call to the lrate function in the rejection sampling code.

A few minor comments for you:

  • I updated the docs so they can build locally. As they are dynamic, it is good to try that out yourself to make sure everything looks ok and the links/examples all work. You just go into the docs folder, activate the Project.toml there, and call include("make.jl"). The resulting build directory should then contain the local doc build.
  • Since we assume Julia 1.6, you don't need f(u; p = p); you can just write f(u; p), and if there is a variable named p it will get picked up for you.
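
A small sketch of the keyword shorthand mentioned in the last point; the function `scale` is a hypothetical example used only for illustration:

```julia
# Hypothetical function used only to illustrate the shorthand.
scale(u; p) = p .* u

p = 2.0
u = [1.0, 2.0]

explicit = scale(u; p = p)   # keyword spelled out
shorthand = scale(u; p)      # same call: the keyword takes the value of the local variable `p`
```

Both calls pass the same keyword argument, so the two results are identical.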

Contributor Author

@gzagatti gzagatti left a comment

Thanks for the doc updates. They look much better. I have one minor comment for you to consider.

Also, great to learn about a simpler way of passing keyword arguments.

I will build the docs locally to see if everything is working.

By the way, I re-ran the benchmarks. I get closer results. Coevolve sometimes beats NRM.

| Benchmark | Direct | FRM | SortingDirect | NRM | DirectCR | RSSA | RSSACR | Coevolve |
|---|---|---|---|---|---|---|---|---|
| Diffusion CTRW | 4.85 s | 15.88 s | 1.17 s | 0.87 s | 0.43 s | 1.91 s | 0.37 s | 0.77 s |
| Multistate Model | 0.12 s | 0.22 s | 0.12 s | 0.27 s | 0.20 s | 0.11 s | 0.16 s | 0.28 s |
| Neg. Feed. Gene Expression | 0.17 ms | 0.24 ms | 0.21 ms | 0.44 ms | 0.38 ms | 0.35 ms | 0.73 ms | 0.40 ms |
| Marchetti Gene Expression | 0.44 s | 0.62 s | 0.39 s | 0.75 s | 0.79 s | 0.59 s | 0.83 s | 0.96 s |

Some of the differences probably come from my machine (a personal work laptop), so running the benchmarks on a dedicated server might provide more consistent results. However, it seems that only the Marchetti model displays a substantial and persistent difference.

@isaacsas
Member

isaacsas commented Jan 2, 2023

@gzagatti do the changes I made look ok to you? Otherwise I think this is good to go (though some more tests mixing bounded and general VariableRateJumps, along with the other jump types would be good at some point).

Contributor Author

@gzagatti gzagatti left a comment

Thanks for your review.

I like the simplifications to the JumpProblem and the Coevolve aggregator.

I have made some comments on the documentation, otherwise it's all good on my side.

isaacsas and others added 2 commits January 2, 2023 23:25
Co-authored-by: Guilherme Zagatti <[email protected]>
Co-authored-by: Guilherme Zagatti <[email protected]>
@isaacsas isaacsas merged commit a62fdb5 into SciML:master Jan 3, 2023
@isaacsas
Member

isaacsas commented Jan 3, 2023

All merged. Thanks @gzagatti this is a great PR.

@gzagatti
Contributor Author

gzagatti commented Jan 3, 2023

You're welcome. Thanks for your support @isaacsas!

@xiaomingfu2013
Contributor

xiaomingfu2013 commented Jan 5, 2023

Hi, thank you for the great work supporting the VariableRateJump type! Maybe I missed something, but in

if (get_num_majumps(maj) == 0) || !isempty(rs)

it seems to me that !isempty(rs) should check whether the JumpProblem has either ConstantRateJump or VariableRateJump. But in the construction of the Coevolve aggregator, rates only refers to VariableRateJump:
rates = Vector{RateWrapper}(undef, nvrjs)

Should the condition !isempty(rs) be changed to !isempty(urates) instead?
Thank you!

@isaacsas
Member

isaacsas commented Jan 5, 2023

Hi, good catch. Could you submit a PR with that change?

xiaomingfu2013 pushed a commit to xiaomingfu2013/DiffEqJump.jl that referenced this pull request Jan 5, 2023
this pull request is related to SciML#276 (comment)
@gzagatti gzagatti deleted the queue-method-ii branch March 27, 2023 08:35
4 participants