Add an option to skip integrated BCs when the variable is not defined next to the boundary #29700
base: next
Job Documentation, step Docs: sync website on 0129013 wanted to post the following: View the site here. This comment will be updated on new commits.
Looks like this triggers errors on both the HFEM and mortar BCs. I will be a little delayed in investigating this, so anyone should feel free to take over.
next to the boundary closes idaholab#29360
7185678 to ac69425
Looks like there are some associated test failures.
ac69425 to ea4b2a7
Good catch. Pretty unlucky, but recovering after a subdomain change can cause the old subdomain to be missing!
Otherwise this should be ready, I think.
ea4b2a7 to cbf913f
For real this time.
Job Coverage, step Generate coverage on 0129013 wanted to post the following: Framework coverage
Modules coverage: coverage did not change
Full coverage reports
Warnings
This comment will be updated on new commits.
Test failures are not related: an OpenMPI stochastic failure and a build failure on Mac.
paramError("skip_execution_outside_variable_domain",
           "This boundary condition is being executed outside the domain of definition of "
           "its variable");
IIRC paramError is not actually thread-safe.
Seems like something we should fix in paramError, though.
Are you sure? It seems to forward to this, which has a lock in it, so someone thought about thread safety:
[[noreturn]] void
mooseErrorRaw(std::string msg, const std::string prefix)
{
  if (Moose::_throw_on_error)
    throw std::runtime_error(msg);

  msg = mooseMsgFmt(msg, "*** ERROR ***", COLOR_RED);

  std::ostringstream oss;
  oss << msg << "\n";

  // this independent flush of the partial error message (i.e. without the
  // trace) is here because trace retrieval can be slow in some
  // circumstances, and we want to get the error message out ASAP.
  msg = oss.str();
  if (!prefix.empty())
    MooseUtils::indentMessage(prefix, msg);
  {
    Threads::spin_mutex::scoped_lock lock(moose_stream_lock);
    Moose::err << msg << std::flush;
  }
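For readers following the thread-safety question, here is a minimal standalone sketch of the pattern in the quoted excerpt: build the message without holding a lock, then take a scoped lock only for the write to the shared stream. It uses `std::mutex` and a string stream in place of `Threads::spin_mutex` and `Moose::err`; the names `stream_lock`, `shared_err`, `reportError`, and `reportFromThreads` are hypothetical and not part of MOOSE.

```cpp
#include <mutex>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-ins for the shared error stream and the lock guarding it.
static std::mutex stream_lock;
static std::ostringstream shared_err;

// Format the message without holding the lock, then lock only for the write
// so that messages from different threads cannot interleave.
void
reportError(const std::string & msg)
{
  std::ostringstream oss;
  oss << "*** ERROR ***\n" << msg << "\n";
  const std::string formatted = oss.str();

  std::lock_guard<std::mutex> lock(stream_lock);
  shared_err << formatted;
}

// Report one error from each of n_threads threads and return all output.
std::string
reportFromThreads(unsigned int n_threads)
{
  shared_err.str("");
  std::vector<std::thread> workers;
  for (unsigned int i = 0; i < n_threads; ++i)
    workers.emplace_back([i] { reportError("thread " + std::to_string(i)); });
  for (auto & w : workers)
    w.join();
  return shared_err.str();
}
```

Note that the lock in this sketch protects only the stream write. Whether everything else on the error path (for example the early `throw` when `_throw_on_error` is set) is safe to call from a threaded loop is a separate question, which is what the test and thread sanitizer run discussed below are meant to check.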
If you add a test for the error, I will check
OK, did that, and turned on the Clang thread sanitizer recipe on CIVET, so hopefully you don't need to check manually?
Time Step 2, time = 2, dt = 1
We caught a libMesh error in ThreadedElementLoopBase:No index 8 in ghosted vector.
Vector contains [0,6)
And empty ghost array.
Stack frames: 16
0: ___interceptor_backtrace
1: libMesh::print_trace(std::ostream&)
2: libMesh::MacroFunctions::report_error(char const*, int, char const*, char const*, std::ostream&)
3: libMesh::PetscVector<double>::map_global_to_local_index(unsigned long) const
4: libMesh::PetscVector<double>::get(std::vector<unsigned long, std::allocator<unsigned long> > const&, double*) const
5: MooseVariableDataBase<double>::fetchDoFValues()
6: MooseVariableData<double>::computeValues()
7: MooseVariableFE<double>::computeElemValuesFace()
8: SystemBase::reinitElemFace(libMesh::Elem const*, unsigned int, unsigned int)
9: FEProblemBase::reinitElemFace(libMesh::Elem const*, unsigned int, unsigned int)
10: ComputeMaterialsObjectThread::onBoundary(libMesh::Elem const*, unsigned int, short, libMesh::Elem const*)
11: ThreadedElementLoopBase<libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*> >::operator()(libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*> const&, bool)
12: void* libMesh::Threads::run_body<libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*>, ComputeMaterialsObjectThread>(void*)
13: ../../../moose_test-devel(+0x62833) [0x55f9012a8833]
14: /lib/x86_64-linux-gnu/libc.so.6(+0x9ca94) [0x7f90bf017a94]
15: __clone
[0] /data/lindad/projects/thread-debugging/scripts/../libmesh/installed/include/libmesh/petsc_vector.h, line 1080, compiled Feb 11 2025 at 13:26:10
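To make the message at the top of this trace easier to read, here is a simplified, hypothetical model of a global-to-local index lookup in a ghosted vector. It is not libMesh's implementation; `GhostedIndexMap` and its members are invented for illustration. A global index resolves only if it lies in the locally owned range or appears in the ghost map, so index 8 with an owned range of [0,6) and no ghost entries cannot be resolved.

```cpp
#include <cstddef>
#include <map>

// Hypothetical, simplified index map for a ghosted parallel vector.
struct GhostedIndexMap
{
  std::size_t first_local;                   // first global index owned locally
  std::size_t last_local;                    // one past the last owned index
  std::map<std::size_t, std::size_t> ghosts; // global index -> local slot

  // Returns true and sets 'local' if the global index is owned or ghosted.
  bool
  globalToLocal(std::size_t global, std::size_t & local) const
  {
    if (global >= first_local && global < last_local)
    {
      local = global - first_local;
      return true;
    }
    const auto it = ghosts.find(global);
    if (it != ghosts.end())
    {
      local = it->second;
      return true;
    }
    // Neither owned nor ghosted: the caller asked for a DoF that this
    // vector was never set up to hold on this rank.
    return false;
  }
};
```

In this model, a failed lookup means the requested degree of freedom belongs to a variable or element the local vector was not prepared for, which is consistent with values being requested on an element face where the variable is not defined.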
OK, so materials are being computed where they should not be. I guess that's expected.
OK, found the problem: the subdomain modifier object is triggering material evaluations.
I can fix the problem, but I can't really make it watertight against someone else doing the same thing from another object.
Basically, none of the frames in this backtrace seems to have enough context to me, at least not enough to produce a useful error.
I feel like we can only marginally improve on the PETSc out-of-bounds (OOB) error here:
5: MooseVariableDataBase<double>::fetchDoFValues()
too deep; does not know about boundaries, so it can't really inform on the root cause
6: MooseVariableData<double>::computeValues()
too deep; does not know about boundaries, so it can't really inform on the root cause
7: MooseVariableFE<double>::computeElemValuesFace()
too deep; does not know about boundaries, so it can't really inform on the root cause
8: SystemBase::reinitElemFace(libMesh::Elem const*, unsigned int, unsigned int)
not deep enough; does not even know which variable is missing on this elem + side
9: FEProblemBase::reinitElemFace(libMesh::Elem const*, unsigned int, unsigned int)
does not know about variables
10: ComputeMaterialsObjectThread::onBoundary(libMesh::Elem const*, unsigned int, short, libMesh::Elem const*)
does not know about variables
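As a rough illustration of the point above, a useful error needs a check that knows both the variable and the subdomain of the element next to the boundary. The sketch below is hypothetical and not MOOSE code: `Variable`, `BoundaryAction`, and `checkBoundaryExecution` are invented names, and the boolean flag only mirrors the intent of the `skip_execution_outside_variable_domain` parameter quoted earlier.

```cpp
#include <set>
#include <stdexcept>
#include <string>

using SubdomainID = int;

// Hypothetical variable description: an empty block set means "whole mesh".
struct Variable
{
  std::string name;
  std::set<SubdomainID> blocks;

  bool
  definedOn(SubdomainID subdomain) const
  {
    return blocks.empty() || blocks.count(subdomain) > 0;
  }
};

enum class BoundaryAction
{
  Execute,
  Skip
};

// Decide what to do on a boundary side whose element lives on elem_subdomain.
// Either skip silently (if requested) or fail with a message naming both the
// variable and the subdomain, before any out-of-bounds vector access happens.
BoundaryAction
checkBoundaryExecution(const Variable & var, SubdomainID elem_subdomain, bool skip_outside_domain)
{
  if (var.definedOn(elem_subdomain))
    return BoundaryAction::Execute;
  if (skip_outside_domain)
    return BoundaryAction::Skip;
  throw std::runtime_error("Boundary condition executed outside the domain of definition of "
                           "variable '" + var.name + "' (element subdomain " +
                           std::to_string(elem_subdomain) + ")");
}
```

A check like this only helps where the caller has both pieces of information at hand, which is the difficulty described above: the frames in the trace have one or the other, but not both.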
And my fix only catches the case in the test. I'll add an error.
Should be fine now. There's no real guarantee that it won't happen again, though: another system that also pre-computes things, and that also checks for boundaries existing before pre-computing, would run into the same bad re-init.
- add a test for the error
b316f91 to 8b07406
- prevent OOB access when initializing stateful material properties on boundary
8b07406 to 0129013
geomsearch failure looks real.
Job Framework 2 on 0129013: invalidated by @GiudGiud. I think it's unrelated.
It passed fine.
closes #29360
@jmeier
Came back to it since it may help #29699 as well