rebuild release files #359
@hrshdhgd the problems with ORDO are still there I think; I only reran the whole pipeline, and all the stuff you added earlier is lost again.
@hrshdhgd Your changes were undone again by the latest run.
Or at least the results look strange.
@joeflack4 @hrshdhgd We should all remember to always rerun the whole pipeline when we fix something.
Yeah, I remember these instructions. Thank gosh I haven't needed to edit `mondo-ingest` since, because I'm not looking forward to doing this when I do ; ;. But there's no other way, I suppose.
Something is wrong here again :( There should be fewer than 100, probably fewer than 20, matches here, not 4K. It looks as if it matches nothing.
It seems this is the issue: (some lines in the generated mondo.sssom.tsv)
MONDO:8000015 46,XY sex reversal 11 skos:exactMatch orphanet.ordo:983 semapv:UnspecifiedMatching
MONDO:8000030 obsolete morphological anomaly skos:exactMatch orphanet.ordo:377791 semapv:UnspecifiedMatching
MONDO:8000031 obsolete subtype of a disorder skos:exactMatch orphanet.ordo:557494 semapv:UnspecifiedMatching
MONDO:8000032 obsolete malformation syndrome skos:exactMatch orphanet.ordo:377789 semapv:UnspecifiedMatching
MONDO:8000033 obsolete group of disorders skos:exactMatch orphanet.ordo:557492 semapv:UnspecifiedMatching
MONDO:8000034 obsolete disorder skos:exactMatch orphanet.ordo:557493 semapv:UnspecifiedMatching
@hrshdhgd can you remind me how you solved the `orphanet.ordo` prefix issue in the Mondo repo? It seems that when running https://github.com/monarch-initiative/mondo-ingest/blob/main/src/ontology/mondo-ingest.Makefile#L346 (which is in the mondo repo, not here), we still get the `orphanet.ordo` prefix?
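A quick way to see which prefixes actually end up in the generated file is to count them. The following is only a minimal sketch, not part of the mondo-ingest pipeline: it assumes the standard SSSOM column name `object_id`, an optional `#`-commented metadata header, and a hypothetical local path `mondo.sssom.tsv`. Which prefix is the intended one is not decided here; the script only reports what it finds.

```python
import csv
from collections import Counter
from typing import Iterable


def count_object_prefixes(lines: Iterable[str]) -> Counter:
    """Count CURIE prefixes found in the object_id column of an SSSOM TSV."""
    # SSSOM files may start with a commented metadata block ('#' lines); skip it.
    data_lines = (line for line in lines if not line.startswith("#"))
    reader = csv.DictReader(data_lines, delimiter="\t")
    counts: Counter = Counter()
    for row in reader:
        object_id = row.get("object_id") or ""
        # Everything before the first ':' is treated as the prefix.
        prefix = object_id.split(":", 1)[0]
        counts[prefix] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical path; point this at the generated mapping file.
    with open("mondo.sssom.tsv", encoding="utf-8", newline="") as handle:
        for prefix, n in count_object_prefixes(handle).most_common():
            print(f"{prefix}\t{n}")
```

If an unexpected prefix dominates the counts, that would be consistent with the inflated number of matches reported above.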
Running `make` on `$(MAPPINGSDIR)/%.sssom.tsv` gets me correct prefixes. I don't understand what the above line does.
So after the mondo repo is cloned, it executes `make mappings`, which itself executes `make $(MAPPINGSDIR)/mondo.sssom.tsv`.
Do you run the command with ODK or with a local configuration?
Local configuration. ODK errored out (`Peak memory: 15008564 kb`), as I mentioned below.
What command do you use to run the whole thing? Just curious.
This is the command for the whole Mondo Ingest!
Running in a local environment...
Yeah, I run it overnight.
I just ran this for 18 hours and saw this error, only to start over.
Let's find a better way to deal with this pipeline; that is definitely not the way (18 hours, holy moly!).
That is crazy. If none of the inputs changed, though, we should have some way for the build to pick up where it left off. This might be addressed by:
Or by
Because if I remember correctly, it just re-downloads the inputs without knowing whether they've really changed, and then runs the full pipeline from there.
Edit: Nico and I discussed this, and perhaps there is a phony goal somewhere in the middle that is triggering things to be rerun.
The time to download is not the main problem. Depending on the speed of the internet connection, that takes maybe around 20 minutes. I bet one of the main issues is the huge amount of IO (reading and writing enormous ontology files with ROBOT, relation-graph, sssom-py).
What I am saying is not that it takes a long time to download. I'm just wondering whether the fact that it forces downloads is triggering the whole pipeline to run. If the source hasn't changed, then it shouldn't need to rebuild the whole pipeline, but just a (sometimes potentially very small) fraction of it.
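To illustrate the idea, here is a minimal sketch of a "replace only if changed" download step. It is an assumption about one possible approach, not how mondo-ingest currently works: the function names are hypothetical, and it relies on `make` comparing file modification times, so leaving an unchanged file untouched keeps its downstream targets up to date. It would not help if a phony goal in the middle of the pipeline is what forces the rerun.

```python
import hashlib
import os
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replace_if_changed(fresh: Path, target: Path) -> bool:
    """Move `fresh` over `target` only when their contents differ.

    Returns True if `target` was created or updated. When the contents are
    identical, `fresh` is deleted and `target` keeps its old modification
    time, so timestamp-based build tools see no change.
    """
    if target.exists() and sha256_of(fresh) == sha256_of(target):
        fresh.unlink()
        return False
    target.parent.mkdir(parents=True, exist_ok=True)
    # os.replace overwrites atomically; assumes both paths are on one filesystem.
    os.replace(fresh, target)
    return True
```

A download rule would then fetch each source to a temporary file and call `replace_if_changed(tmp, target)` instead of overwriting the target directly.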
Here are the version differences between the requirements and my virtual environment.
Notice the ones I've starred (*).
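A comparison like this can also be generated automatically. The sketch below is an illustration under stated assumptions: it only understands exact `name==version` pins, the file name `requirements.txt` is hypothetical, and the helper names are made up for this example. It uses the standard-library `importlib.metadata` to look up what is installed in the active environment.

```python
from importlib import metadata
from typing import Callable, Dict, Iterable, Optional, Tuple


def parse_pins(lines: Iterable[str]) -> Dict[str, str]:
    """Extract exact `name==version` pins; all other lines are ignored."""
    pins: Dict[str, str] = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        version = version.split(";", 1)[0].strip()  # drop environment markers
        pins[name.strip().lower()] = version
    return pins


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def diff_versions(
    pins: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each mismatching package to (pinned version, installed version)."""
    mismatches: Dict[str, Tuple[str, Optional[str]]] = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    # Hypothetical path; adjust to the requirements file actually used.
    with open("requirements.txt", encoding="utf-8") as handle:
        for name, (wanted, found) in sorted(diff_versions(parse_pins(handle)).items()):
            print(f"* {name}: pinned {wanted}, installed {found}")
```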
Obsolete now.