Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for cut timestep size nonlinear solver strategy #6235

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

jdannberg
Copy link
Contributor

If the nonlinear solver fails in a timestep with mesh refinement and "cut timestep size" is selected as the Nonlinear solver failure strategy, ASPECT now correctly repeats the timestep first before refining the mesh. Before, ASPECT would execute refinement/coarsening of the mesh first, which could lead to changes of the mesh in each failure cycle.

I also added a test case: If you run it on main, it refines the mesh in timestep 1 before the cutback, so the solve fails again (because the mesh is now different) triggering a second cutback. With the current version, the solve succeeds after 1 cutback, refining the mesh and moving to timestep 2 afterwards.

Some background on this:
I tried out the "cut timestep size" nonlinear solver strategy and it made a model crash that otherwise simply had the nonlinear solver not converge in some steps. I looked for the reason, and it turned out the temperature in my model had increased by a constant everywhere (see image below). This was a way bigger change than between individual time steps. I don't know why changing the mesh first before repeating the time step would lead to the temperature suddenly having really weird values (the change in temperature had a linear relationship to the "Cut back factor"). But in any case, I think refining the mesh first is not what we want.

failure
Screenshot showing temperature before (green) and after (red) cutback. This PR fixes the problem.

For new features/models or changes of existing features:

  • I have tested my new feature locally to ensure it is correct.
  • I have created a testcase for the new feature/benchmark in the tests/ directory.
  • I have added a changelog entry in the doc/modules/changes directory that will inform other users of my change.

@danieldouglas92
Copy link
Contributor

@jdannberg Yes this is great!! I ran into the same thing and made an issue #6049 many months ago but I completely forgot about it until I saw this. The temperature in my models increased by several hundred degrees. I'll try running that model with your changes in this branch to confirm that it fixes my case as well.

@danieldouglas92
Copy link
Contributor

It did fix my model as well!!! So happy you found the solution to this 🥳

@jdannberg
Copy link
Contributor Author

@danieldouglas92 I totally missed your issue (I should pay attention to these more carefully), but yes, it looks like it's exactly the same problem I saw. Glad this fixed it!

@jdannberg jdannberg requested a review from tjhei February 17, 2025 13:09
Copy link
Member

@gassmoeller gassmoeller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for figuring out the bug! This might be serious enough that we need a point release (a bug that is easy enough to trigger and that changes the solution without crashing is dangerous!)

However, I have one wrinkle to this solution: We have one reaction that is called refine_and_repeat_step, I think this should then refine the mesh and then repeat the time step on solver failure. With the change you did here would that reaction still work? It seems to me we would then skip the refinement altogether and just repeat the time step. Maybe we need to split the fix as follows:

        if (time_stepping_manager.should_refine_mesh())
          {
            pcout << "Refining the mesh based on the time stepping manager ...\n" << std::endl;
            refine_mesh(max_refinement_level);
          }

        if (time_stepping_manager.should_repeat_time_step())
          {
...
 }
        if (time_stepping_manager.should_refine_mesh() == false)
          maybe_refine_mesh(new_time_step_size, max_refinement_level);

So that we split the regular refinement from the refinement for time-stepping reasons.
Did I understand this correctly?

@jdannberg
Copy link
Contributor Author

This does make sense to me. But I did not see a test or a place where either the refine_and_repeat_step or the refine_and_advance reactions are being used.

Copy link
Member

@tjhei tjhei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. I think the logic makes sense. 👍🏻

@jdannberg jdannberg force-pushed the failure_cut_timestep_fix branch from 280170e to acfd813 Compare February 19, 2025 13:13
@gassmoeller gassmoeller mentioned this pull request Feb 20, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants