Runge-Kutta Stepper Assertion, main branch (2024.05.04.) #568

Open
krasznaa opened this issue May 4, 2024 · 2 comments

@krasznaa
Member

krasznaa commented May 4, 2024

While playing with traccc_seq_example_cuda on the ODD simulation files that I made as described in #561, I ran into the following assertion:

WARNING: @traccc::io::csv::read_cells: 162 duplicate cells found in /data/ssd-1tb/projects/traccc/traccc/data/odd/geant4_10muon_100GeV/event000000025-cells.csv
/data/ssd-1tb/projects/traccc/build-debug/_deps/detray-src/core/include/detray/propagator/rk_stepper.ipp:756: __nv_bool detray::rk_stepper<magnetic_field_t, algebra_t, constraint_t, policy_t, inspector_t, array_t>::step(propagation_state_t &, const detray::stepping::config<detray::detail::get_scalar<algebra_t, void>::scalar> &) [with propagation_state_t = detray::propagator<detray::rk_stepper<covfie::field_view<covfie::backend::constant<covfie::vector::vector_d<float, 3UL>, covfie::vector::vector_d<float, 3UL>>>, detray::cmath<float>, detray::constrained_step<detray::darray>, detray::stepper_rk_policy, detray::stepping::void_inspector, detray::darray>, detray::navigator<const detray::detector<detray::default_metadata, detray::container_types<vecmem::device_vector, detray::tuple, detray::darray, vecmem::jagged_device_vector, detray::dmap>>, detray::navigation::void_inspector, detray::intersection2D<detray::surface_descriptor<detray::detail::typed_index<detray::default_metadata::mask_ids, unsigned int, unsigned int, 4026531840U, 268435455U>, detray::detail::typed_index<detray::default_metadata::material_ids, unsigned int, unsigned int, 4026531840U, 268435455U>, unsigned int, unsigned short>, detray::cmath<float>>>, detray::actor_chain<std::tuple, detray::pathlimit_aborter, detray::parameter_transporter<detray::cmath<float>>, traccc::interaction_register<detray::pointwise_material_interactor<detray::cmath<float>>>, detray::pointwise_material_interactor<detray::cmath<float>>, traccc::ckf_aborter>>::state, magnetic_field_t = covfie::field_view<covfie::backend::constant<covfie::vector::vector_d<float, 3UL>, covfie::vector::vector_d<float, 3UL>>>, algebra_t = detray::cmath<float>, constraint_t = detray::constrained_step<detray::darray>, policy_t = detray::stepper_rk_policy, inspector_t = detray::stepping::void_inspector, array_t = detray::darray]: block: [0,0,0], thread: [6,0,0] Assertion `stepping._initialized == false` failed.
terminate called after throwing an instance of 'std::runtime_error'
  what():  /data/ssd-1tb/projects/traccc/traccc/device/cuda/src/finding/finding_algorithm.cu:493 Failed to execute: cudaMemcpyAsync(&global_counter_host, global_counter_device.get(), sizeof(device::finding_global_counter), cudaMemcpyDeviceToHost, stream) (device-side assert triggered)
Aborted (core dumped)

For "low-intensity" events it doesn't show up, but at higher intensities it does. 😕

The code is of course this: https://github.com/acts-project/detray/blob/main/core/include/detray/propagator/rk_stepper.ipp#L721-L760 Without actually knowing what's going on there, it just seems buggy. 🤔 Asserting that a state cannot occur, and then handling that same state gracefully on the very next line, does not seem correct.

I was thinking of opening this in the detray repository, but since the error shows up most easily using the code of this repository, this seemed easier. 🤔

Note that if the assertions are disabled (not using CMAKE_BUILD_TYPE=Debug...), then I don't see any obvious errors coming from this code. So to first order it would seem that this one assertion should just be removed...?

@krasznaa krasznaa added the bug Something isn't working label May 4, 2024
@beomki-yeo
Contributor

beomki-yeo commented May 4, 2024

It is possibly a crazy particle causing a problem but I don't know exactly why.
Unless it is a debug build, it won't cause any problem because the propagation will get aborted:

    assert(stepping._initialized == false);
    // If the stepper state is still in the initialized state, abort.
    if (stepping._initialized == true) {
        return navigation.abort();
    }

So let's not remove this assertion until we understand the problem. Or do you need to make things fully work for the debug build?

EDIT: Maybe it doesn't matter. I will just let you decide 🤔

@krasznaa
Member Author

krasznaa commented May 4, 2024

Unfortunately this assertion prevents any other debugging from being done on the full chain application at the moment. 😦 So, with everything else also going on (for instance #569), we really need to silence it for now.

Note that this will need a PR into the Detray repository.
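
As a rough sketch of what "silencing" could mean here: drop the `assert` and keep only the graceful abort that already follows it. The types and the `finish_step` function below are hypothetical stand-ins for illustration, not detray's actual classes, and the `bool` returned by `abort()` is an assumption based on the `return navigation.abort();` line quoted above.

```cpp
// Hypothetical stand-ins for the stepper and navigator states.
// These are NOT detray's real types; they only mimic the two members used here.
struct stepping_state {
    bool initialized = true;
};

struct navigation_state {
    bool aborted = false;

    // Assumed shape of `navigation.abort()`: flag the abort and report failure.
    bool abort() {
        aborted = true;
        return false;
    }
};

// End-of-step check with the assertion removed. The unexpected "still
// initialized" state aborts the propagation in debug and non-debug builds alike.
bool finish_step(stepping_state& stepping, navigation_state& navigation) {
    if (stepping.initialized) {
        return navigation.abort();
    }
    return true;
}
```

With this shape, a track whose stepper is still in the initialized state is dropped through the normal abort path instead of triggering a device-side assert, so the rest of the chain can still be debugged. The downside is that the underlying cause is no longer flagged in debug builds.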
