Cached inputs are incongruent with what STRING transformer expects -> FileNotFoundError #473

Open
caufieldjh opened this issue Aug 3, 2023 · 1 comment
Labels
bug Something isn't working

Comments

@caufieldjh
Contributor

In #472, I bumped the STRING input data to v11.5 and changed all references to it... or at least I thought so.
Here's one I missed:

10:11:43  Traceback (most recent call last):
10:11:43    File "run.py", line 202, in <module>
10:11:43      cli()
10:11:43    File "/var/lib/jenkins/workspace/dge-graph-hub_kg-covid-19_master/gitrepo/venv/lib/python3.8/site-packages/click/core.py", line 1157, in __call__
10:11:43      return self.main(*args, **kwargs)
10:11:43    File "/var/lib/jenkins/workspace/dge-graph-hub_kg-covid-19_master/gitrepo/venv/lib/python3.8/site-packages/click/core.py", line 1078, in main
10:11:43      rv = self.invoke(ctx)
10:11:43    File "/var/lib/jenkins/workspace/dge-graph-hub_kg-covid-19_master/gitrepo/venv/lib/python3.8/site-packages/click/core.py", line 1688, in invoke
10:11:43      return _process_result(sub_ctx.command.invoke(sub_ctx))
10:11:43    File "/var/lib/jenkins/workspace/dge-graph-hub_kg-covid-19_master/gitrepo/venv/lib/python3.8/site-packages/click/core.py", line 1434, in invoke
10:11:43      return ctx.invoke(self.callback, **ctx.params)
10:11:43    File "/var/lib/jenkins/workspace/dge-graph-hub_kg-covid-19_master/gitrepo/venv/lib/python3.8/site-packages/click/core.py", line 783, in invoke
10:11:43      return __callback(*args, **kwargs)
10:11:43    File "run.py", line 74, in transform
10:11:43      kg_transform(*args, **kwargs)
10:11:43    File "/var/lib/jenkins/workspace/dge-graph-hub_kg-covid-19_master/gitrepo/kg_covid_19/transform.py", line 66, in transform
10:11:43      t.run()
10:11:43    File "/var/lib/jenkins/workspace/dge-graph-hub_kg-covid-19_master/gitrepo/kg_covid_19/transform_utils/string_ppi/string_ppi.py", line 171, in run
10:11:43      ) as edge, gzip.open(data_file, "rt") as interactions:
10:11:43    File "/usr/lib/python3.8/gzip.py", line 58, in open
10:11:43      binary_file = GzipFile(filename, gz_mode, compresslevel)
10:11:43    File "/usr/lib/python3.8/gzip.py", line 173, in __init__
10:11:43      fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')
10:11:43  FileNotFoundError: [Errno 2] No such file or directory: 'data/raw/9606.protein.links.full.v11.5.txt.gz'
@caufieldjh caufieldjh added the bug Something isn't working label Aug 3, 2023
@caufieldjh
Contributor Author

Ah, I hadn't actually missed this one.
The build fails when it tries to download ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/idmapping/by_organism/HUMAN_9606_idmapping.dat.gz,
so it falls back to the S3 cache,
which contains older versions of everything,
including the STRING data (under the previous filename).
The transform looks for the newer version, hence the FileNotFoundError.
That's probably an argument for a static filename (like "stringppi.txt.gz" or something), but if I had done that I might not have noticed that the new data wasn't being used.
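
One possible middle ground, sketched below, is to keep the versioned filename but check for it before opening, so that a stale cache produces a message naming the versions that are actually present. This is only an illustration, not code from kg-covid-19: the function name `find_string_input` and the glob pattern are assumptions made up for this example.

```python
from pathlib import Path
from typing import List


def find_string_input(
    raw_dir: str,
    expected_name: str,
    pattern: str = "9606.protein.links.full.v*.txt.gz",  # assumed naming scheme
) -> Path:
    """Return the expected STRING file, or fail with a message listing other versions.

    Hypothetical helper for illustration only.
    """
    raw = Path(raw_dir)
    expected = raw / expected_name
    if expected.is_file():
        return expected

    # Look for differently versioned copies, e.g. ones restored from a cache.
    others: List[str] = sorted(
        p.name for p in raw.glob(pattern) if p.name != expected_name
    )
    hint = f" Found other versions: {', '.join(others)}." if others else ""
    raise FileNotFoundError(
        f"Expected {expected} but it is missing.{hint} "
        "The download may have failed and fallen back to cached inputs."
    )
```

Called as `find_string_input("data/raw", "9606.protein.links.full.v11.5.txt.gz")`, this would still fail on a stale cache, but the error would point at the version mismatch rather than just a missing path.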

@caufieldjh caufieldjh changed the title from "Fix hardcoded path for STRING" to "Cached inputs are incongruent with what STRING transformer expects -> FileNotFoundError" Aug 3, 2023