Jigsaw - error propagation not completing #5721

Open
lwellerastro opened this issue Feb 3, 2025 · 0 comments
Labels
bug Something isn't working

Comments

@lwellerastro
Contributor

ISIS version(s) affected: 8.3.0

Description
Error propagation has been running for nearly 100 days on my Kaguya TC global control network and I have no idea when it will finish. This network bundles without error propagation in under 50 hours. I don't know where it is in the process because I disabled the Slurm standard output after a previous run, which had been going for 30+ days, produced a log many hundreds of gigabytes in size (I had to kill that run due to cluster and scratch maintenance). My current job appears to be running according to top (not dead or sleeping). This run time seems excessively long, but I don't know how long it should take.

This is not necessarily a bug, but I do not want to post to the discussion board where this problem will receive zero attention.

How to reproduce
All of my data are on /scratch due to their size. The run is occurring under my scratch area Kaguya_TC/Global/Morning/GlobalNetwork/ErrProp/. Here are the specs for the network, followed by my jigsaw command so you can see what is being solved for:

# of images: Kaguya_TC/Global/Morning/GlobalNetwork/ErrProp
# of points: 11573190
# of measures: 45876606

jigsaw froml=KTC_Morning_Global_Combined_ImageValDNAdd_GCP_Fix2.lis cnet=KTC_Morning_Global_Combined_ImageValDNAdd_GCP_Fix2.net \
        onet=JigOut_ErrProp_KTC_Morning_Global_Combined_ImageValDNAdd_GCP_Fix2.net \
        radius=yes update=no errorprop=yes \
        sigma0=1.0e-5 maxits=10 \
        camsolve=accelerations twist=yes overexisting=yes \
        spsolve=position overhermite=yes \
        camera_angles_sigma=0.25 \
        camera_angular_velocity_sigma=0.1 \
        camera_angular_acceleration_sigma=0.01 \
        spacecraft_position_sigma=1000 \
        point_radius_sigma=100 \
        file_prefix=ErrProp_RadAccelTwist_SpkPos_ValDNAddGCP_Fix2
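
For a sense of scale, the counts above can be turned into rough problem-size figures. This is a minimal sketch in Python, not a run-time estimate. It assumes three solved parameters per point with radius=yes, two observation equations (sample and line) per measure, and, on one reading of camsolve=accelerations with twist=yes plus spsolve=position, 12 parameters per image. The function names are hypothetical, and the image count is left as an argument because it is not given above.

```python
# Rough problem-size figures for the network described above.
# Point and measure counts are copied from the specs in this issue.
N_POINTS = 11_573_190
N_MEASURES = 45_876_606


def point_parameters(n_points, solve_radius=True):
    """Unknowns contributed by points: lat, lon, and radius if solved."""
    return n_points * (3 if solve_radius else 2)


def observation_equations(n_measures):
    """Each measure contributes a sample and a line residual."""
    return 2 * n_measures


def image_parameters(n_images, per_image=12):
    """Unknowns contributed by images.

    per_image=12 is an ASSUMPTION: 3 angles x 3 polynomial coefficients
    (camsolve=accelerations, twist=yes) plus 3 position coordinates
    (spsolve=position).
    """
    return n_images * per_image


if __name__ == "__main__":
    print("point parameters:     ", point_parameters(N_POINTS))
    print("observation equations:", observation_equations(N_MEASURES))
    print("measures per point:   ", round(N_MEASURES / N_POINTS, 2))
```

The point unknowns alone come to roughly 34.7 million, which is the part of the problem that error propagation has to produce covariances for in addition to the image parameters.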

Yes, it's enormous, but how long should this take? Is there a way to calculate the amount of time necessary? Other than disabling the standard output as I have done, is it possible (via another post) to disable all of the text going to the screen, which is resulting in a massive log file? It's useful to know that something is happening during the error propagation process, but can users get that information with less verbosity?
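
As a stopgap for the log-size problem, the output could be thinned before it reaches disk instead of being discarded entirely. The following is a minimal sketch, not an ISIS feature: it assumes the progress text arrives on standard output as newline-terminated lines, and the script name `thin_log.py` and the function `thin_lines` are hypothetical.

```python
"""Keep only every N-th line of a very chatty program's output.

Hypothetical usage (script and log names are placeholders):
    jigsaw ... | python thin_log.py 10000 > errprop_progress.log
"""
import sys


def thin_lines(lines, keep_every=10000):
    """Yield the first line and then every keep_every-th line after it."""
    for i, line in enumerate(lines):
        if i % keep_every == 0:
            yield line


def main(argv):
    # Fall back to the default if no valid integer is given.
    keep_every = 10000
    if len(argv) > 1 and argv[1].isdigit() and int(argv[1]) > 0:
        keep_every = int(argv[1])
    if sys.stdin.isatty():
        # Nothing is being piped in, so there is nothing to filter.
        return
    for line in thin_lines(sys.stdin, keep_every):
        sys.stdout.write(line)
        sys.stdout.flush()  # keep the log current for long-running jobs


if __name__ == "__main__":
    main(sys.argv)
```

This keeps a heartbeat in the log so it is still possible to tell that the job is progressing, at a small fraction of the original size.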

I would like my bundle to finish, but it has been hogging a resource for months now and it's not clear to me that there is any end in sight.
Any suggestions on how to guesstimate how long this needs are appreciated.

@lwellerastro added the bug label on Feb 3, 2025