Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an option to skip integrated BCs when the variable is not defined next to the boundary #29700

Open
wants to merge 5 commits into
base: next
Choose a base branch
from

Conversation

GiudGiud
Copy link
Contributor

@GiudGiud GiudGiud commented Jan 17, 2025

closes #29360

@jmeier

Came back to it since it may help #29699 as well

@GiudGiud GiudGiud self-assigned this Jan 17, 2025
@moosebuild
Copy link
Contributor

moosebuild commented Jan 17, 2025

Job Documentation, step Docs: sync website on 0129013 wanted to post the following:

View the site here

This comment will be updated on new commits.

@GiudGiud
Copy link
Contributor Author

GiudGiud commented Jan 21, 2025

Looks like this triggers errors on both the HFEM and mortar BCs. I will be a little delayed in investigating this so anyone feel free to take over

@GiudGiud GiudGiud marked this pull request as ready for review February 3, 2025 05:44
@GiudGiud GiudGiud requested a review from lindsayad as a code owner February 3, 2025 05:44
@lindsayad
Copy link
Member

looks like some associated test failures

@GiudGiud
Copy link
Contributor Author

GiudGiud commented Feb 5, 2025

Good catch. Pretty unlucky, but recovering post a subdomain change can cause the old subdomain to be missing!

@GiudGiud
Copy link
Contributor Author

GiudGiud commented Feb 5, 2025

otherwise this should be ready I think

@GiudGiud
Copy link
Contributor Author

GiudGiud commented Feb 7, 2025

for real this time

@moosebuild
Copy link
Contributor

moosebuild commented Feb 7, 2025

Job Coverage, step Generate coverage on 0129013 wanted to post the following:

Framework coverage

cd6626 #29700 012901
Total Total +/- New
Rate 85.29% 85.29% -0.00% 85.71%
Hits 108457 108470 +13 18
Misses 18702 18707 +5 3

Diff coverage report

Full coverage report

Modules coverage

Coverage did not change

Full coverage reports

Reports

Warnings

  • framework new line coverage rate 85.71% is less than the suggested 90.0%

This comment will be updated on new commits.

@GiudGiud
Copy link
Contributor Author

Test failures are not related. An openMPI stochastic failure and a build failure on mac

Comment on lines +101 to +103
paramError("skip_execution_outside_variable_domain",
"This boundary condition is being executed outside the domain of definition of "
"its variable");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC paramError is not actually thread-safe

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like something we should fix in paramError though

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you sure because it seems to forward to this, which has a lock in it so someone thought about thread safety


[[noreturn]] void
mooseErrorRaw(std::string msg, const std::string prefix)
{
  if (Moose::_throw_on_error)
    throw std::runtime_error(msg);

  msg = mooseMsgFmt(msg, "*** ERROR ***", COLOR_RED);

  std::ostringstream oss;
  oss << msg << "\n";

  // this independent flush of the partial error message (i.e. without the
  // trace) is here because trace retrieval can be slow in some
  // circumstances, and we want to get the error message out ASAP.
  msg = oss.str();
  if (!prefix.empty())
    MooseUtils::indentMessage(prefix, msg);
  {
    Threads::spin_mutex::scoped_lock lock(moose_stream_lock);
    Moose::err << msg << std::flush;
  }

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you add a test for the error, I will check

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok did that, and turned on the clang thread sanitizer recipe on civet so hopefully you dont need to check manually?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Time Step 2, time = 2, dt = 1

We caught a libMesh error in ThreadedElementLoopBase:No index 8 in ghosted vector.
Vector contains [0,6)
And empty ghost array.

Stack frames: 16
0: ___interceptor_backtrace
1: libMesh::print_trace(std::ostream&)
2: libMesh::MacroFunctions::report_error(char const*, int, char const*, char const*, std::ostream&)
3: libMesh::PetscVector<double>::map_global_to_local_index(unsigned long) const
4: libMesh::PetscVector<double>::get(std::vector<unsigned long, std::allocator<unsigned long> > const&, double*) const
5: MooseVariableDataBase<double>::fetchDoFValues()
6: MooseVariableData<double>::computeValues()
7: MooseVariableFE<double>::computeElemValuesFace()
8: SystemBase::reinitElemFace(libMesh::Elem const*, unsigned int, unsigned int)
9: FEProblemBase::reinitElemFace(libMesh::Elem const*, unsigned int, unsigned int)
10: ComputeMaterialsObjectThread::onBoundary(libMesh::Elem const*, unsigned int, short, libMesh::Elem const*)
11: ThreadedElementLoopBase<libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*> >::operator()(libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*> const&, bool)
12: void* libMesh::Threads::run_body<libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*>, ComputeMaterialsObjectThread>(void*)
13: ../../../moose_test-devel(+0x62833) [0x55f9012a8833]
14: /lib/x86_64-linux-gnu/libc.so.6(+0x9ca94) [0x7f90bf017a94]
15: __clone
[0] /data/lindad/projects/thread-debugging/scripts/../libmesh/installed/include/libmesh/petsc_vector.h, line 1080, compiled Feb 11 2025 at 13:26:10

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok so materials are being computed where they should not. I guess that's expected

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok found the problem, the subdomain modifier object is triggering material evaluations.

I can fix the problem but not really make it waterproof to someone else doing the same thing from another object

basically each line of this backtrace does not seem like it has enough context to me? To make a useful error at least.
I feel like can make marginally improve on the petsc OOB here 5: MooseVariableDataBase<double>::fetchDoFValues()

5: MooseVariableDataBase::fetchDoFValues()
too deep, does not know about boundaries so cant really inform on the root cause

6: MooseVariableData::computeValues()
too deep, does not know about boundaries so cant really inform on the root cause

7: MooseVariableFE::computeElemValuesFace()
too deep, does not know about boundaries so cant really inform on the root cause

8: SystemBase::reinitElemFace(libMesh::Elem const*, unsigned int, unsigned int)
not deep enough, dont even know which variable is missing on this elem + side

9: FEProblemBase::reinitElemFace(libMesh::Elem const*, unsigned int, unsigned int)
does not know about variables

10: ComputeMaterialsObjectThread::onBoundary(libMesh::Elem const*, unsigned int, short, libMesh::Elem const*)
does not know about variables

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and my fix only catches the test. I ll add an error

Copy link
Contributor Author

@GiudGiud GiudGiud Feb 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be fine now. Though there s no real guarantees that it would not happen again with another system, that also pre-computes things and that also checks for boundaries existing before pre-computing, would get in the same bad re-init.

- add a test for the error
@GiudGiud GiudGiud force-pushed the PR_moving_neumann branch 2 times, most recently from b316f91 to 8b07406 Compare February 12, 2025 04:57
- prevent OOB access when initializing stateful material properties on boundary
@lindsayad
Copy link
Member

geomsearch failure looks real

@moosebuild
Copy link
Contributor

Job Framework 2 on 0129013 : invalidated by @GiudGiud

I think it s unrelated

@GiudGiud
Copy link
Contributor Author

it passed fine

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for not executing integrated BCs when the variable is no longer defined near that boundary
3 participants