Cubical Montblanc error #409
Hi @joesbright. If possible, would you mind uploading your sky model file? I want to check that the problem isn't specific to the model. A second possibility: looking at the log, I see that the MS has two spectral windows with a single channel in each. I have never run CubiCal on this particular case, so the problem could stem from that. |
Hi @JSKenyon, See the sky model below. The data are old VLA data, hence the single channel per SPW. Thanks for the quick response! |
Great! I will take a look in the morning. |
Just a quick update: when I combine the SPWs with mstransform and rerun, I get the same 'future warnings' as in the previous log file, but then simply get an 'illegal instruction (core dumped)' error. This is similar to what @IanHeywood saw in issue #238, but I am not running on IDIA. Thanks again, |
It means one of the upstream dependencies has been compiled with -march
targeting a specific machine architecture, so the shared objects contain
machine instructions that are illegal on this CPU. I think we have to run
it with gdb to find out which.
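Before reaching for gdb, a rough first pass is to import each suspect package in its own child interpreter and see which one dies with SIGILL. The sketch below is only an illustration: the helper names `probe_import` and `died_of_sigill` are made up here, and it assumes the crash happens at import time, which may not be true for this issue (it could happen later, when a compiled kernel is first called).

```python
import signal
import subprocess
import sys


def probe_import(module_name):
    """Import one module in a child interpreter and return the child's exit code.

    -X faulthandler makes the child print a Python traceback to stderr
    if it is killed by SIGILL, SIGSEGV and similar fatal signals.
    """
    result = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", f"import {module_name}"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return result.returncode


def died_of_sigill(returncode):
    """On POSIX, a child killed by a signal reports the negated signal number."""
    return returncode == -signal.SIGILL


if __name__ == "__main__":
    # Module names come from the command line, e.g. the compiled
    # dependencies listed in the pip freeze.
    for name in sys.argv[1:]:
        code = probe_import(name)
        print(name, "-> SIGILL" if died_of_sigill(code) else f"-> exit code {code}")
```

Which modules to pass in is a guess on the user's side; the compiled packages in the pip freeze are the natural candidates. If none of them fails at import, gdb on the full run is still the way to go.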
--
--
Benjamin Hugo
PhD. student,
Centre for Radio Astronomy Techniques and Technologies
Department of Physics and Electronics
Rhodes University
Junior software developer
Radio Astronomy Research Group
South African Radio Astronomy Observatory
Black River Business Park
Observatory
Cape Town
|
I cannot seem to reproduce this error. I have tried creating a 2 band MS with a single channel (using simms) in each and predicting using your sky model. I am not quite sure how to help further unless you can share the data? I understand if that is impossible though. I am running in a python3 (3.6.9) virtualenv with a fresh install of all python dependencies, ignoring cached installs. In theory everything should be up-to-date. |
Out of interest, where are you running this? Is it on a local laptop/desktop? Or is it on a server somewhere? |
Can you post a pip freeze for reference?
|
pip freeze
absl-py==0.10.0 |
This is running on a server with multiple nodes. I successfully ran on one of the newer nodes via a singularity shell, but I've also attached a pip freeze from the older node where I was having the issues I mentioned previously. Thanks @JSKenyon and @bennahugo for the help.
|
This is still mysterious, as it was in #238. The fact that it works on one node and not another suggests something system related but I have no instinct for the cause. |
As I mentioned, instruction sets are specific to the CPU model. If you enable
-march=native, the SSE/AVX and manufacturer-specific instructions are
compiled into the machine code of the distributed binary wheel. It may work
on a range of architectures, but not necessarily across generations or
manufacturers.
This is why you would get SIGILL from the kernel.
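One way to check this between the working node and the failing node is to compare the CPU feature flags each one reports. The following is a minimal sketch, assuming x86 Linux nodes where /proc/cpuinfo has a "flags" line; the function names are invented for illustration.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            # The first "flags" line is enough: all cores of one CPU
            # model report the same set.
            return set(value.split())
    return set()


def missing_flags(build_flags, run_flags):
    """Flags present where the code was built but absent where it runs."""
    return sorted(set(build_flags) - set(run_flags))


if __name__ == "__main__":
    # Run this on each node and diff the printed lists by hand,
    # or save the outputs and feed both sets to missing_flags().
    with open("/proc/cpuinfo") as handle:
        print(" ".join(sorted(parse_cpu_flags(handle.read()))))
```

If the newer node reports flags such as avx2 or avx512f that the older node lacks, a binary built for the newer CPU is a plausible source of the illegal instruction, although this comparison alone does not say which shared object is responsible.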
|
You can try pip's --no-binary option to force a build from source instead
of using prebuilt wheels, both to check whether the issue arises in the
wheels and as a workaround.
I will look into this with gdb tomorrow.
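For reference, the reinstall could be assembled roughly as below. This is a sketch, not a tested recipe for this environment: the helper name `source_install_command` is made up, and which package to rebuild is not established in this thread, so the package name is a placeholder argument.

```python
import sys


def source_install_command(package):
    """Build a pip command that reinstalls one package from source.

    --no-binary <package>  refuses prebuilt wheels for that package only
    --no-cache-dir         avoids reusing a previously cached build
    --force-reinstall      replaces the copy that is already installed
    --no-deps              leaves the package's dependencies untouched
    """
    return [
        sys.executable, "-m", "pip", "install",
        "--no-cache-dir", "--force-reinstall", "--no-deps",
        "--no-binary", package,
        package,
    ]


if __name__ == "__main__":
    # Print the command for the package named on the command line;
    # run it inside the affected virtualenv on the failing node.
    print(" ".join(source_install_command(sys.argv[1])))
```

Building on the node that actually runs the job matters here: a source build made on the newer node and copied over could contain the same unsupported instructions.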
|
Thanks @bennahugo! |
Hi,
I'm running into an issue with cubical when trying to run a G/dE correction. I have a model in my model column and a tigger sky-model (converted from pyBDSM) of the problem source. I've attached the logs below. The error seems to be from montblanc, but I can't immediately see what is causing the issue.
Appreciate any information on this you could share.
Thanks,
Joe
cubical_log.txt