Missing .tch file '/da5_fast/f2cFullU.13.tch', also no python3 on da5 #2

Open
jnareb opened this issue Oct 13, 2022 · 10 comments

jnareb commented Oct 13, 2022

When trying to find all commits that changed a given file (and then all projects that included that file at some point), I tried to follow the example from the oscar.py documentation: https://ssc-oscar.github.io/oscar.py/

import oscar

commits = oscar.File('minicms/templatetags/minicms_tags.py').commit_shas

Running the script with python3 on 'da0' gave an error, along with a suggestion to run it on 'da4':

_frozen_importlib:219: UserWarning: Commit and tree direct content is only available on da4.
Some functions might not work as expected.

But after logging in to 'da4' and re-running the script, I got the following error:

OSError: Failed to close .tch "b'/da5_fast/f2cFullU.13.tch'": file not found
Exception ignored in: 'oscar.Hash.__del__'
OSError: Failed to close .tch "b'/da5_fast/f2cFullU.13.tch'": file not found
Traceback (most recent call last):
  File "./script.py", line 9, in <module>
    commits = oscar.File('minicms/templatetags/minicms_tags.py').commit_shas
  File "oscar.pyx", line 344, in oscar.cached_property.wrapper
  File "oscar.pyx", line 1578, in oscar.File.commit_shas
  File "oscar.pyx", line 574, in oscar._Base.read_tch
  File "oscar.pyx", line 523, in oscar._get_tch
  File "oscar.pyx", line 459, in oscar.Hash.__cinit__
OSError: Failed to open .tch file "b'/da5_fast/f2cFullU.13.tch'": file not found

I have checked that I am using oscar version 2.2.1, which is the newest release.

Based on the pathname of this non-existent file, I tried logging in to 'da5', but there is no python3 installed on 'da5':

jnareb@da5:~> python --version
Python 2.7.5
jnareb@da5:~> python3 --version
-bash: python3: command not found

audrism commented Oct 14, 2022

ls /da?_fast/f2c*.0.tch
/da5_fast/f2cFullT.0.tch

f2c random lookup is little used, as typically one greps for the right file patterns in the output of zcat /da?_data/basemaps/gz/f2c*.s (or c2f).

As such, it has not been updated since version T. If you have a use case for f2c random lookup, please let me know and I might create version U.

Red Hat has some suggestions for Python 3, but I am not sure they work, so it is better to use da4:

$ scl enable rh-python36 bash
$ python3 -V
Python 3.6.3

$ python -V  # python now also points to Python3 
Python 3.6.3

$ mkdir ~/pydev
$ cd ~/pydev

$ python3 -m venv py36-venv
$ source py36-venv/bin/activate

(py36-venv) $ python3 -m pip install ...some modules...

jnareb commented Oct 14, 2022

The problem is that the example code for oscar.File in the oscar.py Python API documentation does not work, at least with oscar version 2.2.1.

Below is the example in question (with from oscar import File added):

>>> from oscar import File
>>> commits = File('minicms/templatetags/minicms_tags.py').commit_shas
>>> len(commits) > 0
True

But it does not work:

jnareb@da4:~> python3
Python 3.6.8 (default, Nov 16 2020, 16:55:22)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-44)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from oscar import File
>>> commits = File('minicms/templatetags/minicms_tags.py').commit_shas
OSError: Failed to close .tch "b'/da5_fast/f2cFullU.13.tch'": file not found
Exception ignored in: 'oscar.Hash.__del__'
OSError: Failed to close .tch "b'/da5_fast/f2cFullU.13.tch'": file not found
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "oscar.pyx", line 344, in oscar.cached_property.wrapper
  File "oscar.pyx", line 1578, in oscar.File.commit_shas
  File "oscar.pyx", line 574, in oscar._Base.read_tch
  File "oscar.pyx", line 523, in oscar._get_tch
  File "oscar.pyx", line 459, in oscar.Hash.__cinit__
OSError: Failed to open .tch file "b'/da5_fast/f2cFullU.13.tch'": file not found

Should I file a bug against oscar.py, at https://github.com/ssc-oscar/oscar.py ?

I have installed the current version of oscar from the repository (4 commits past 2.2.1) into my user account by running the following commands in the oscar.py repository:

python3 setup.py build_ext
python3 setup.py install --user

I have checked that I am running this new version, but I still get the same error as described above.

audrism commented Oct 14, 2022

See the file map issue: f2c is available for version T only.
oscar.pyc only uses current version (U) which is absent.
getValues, on the other hand, checks if prior versions exist.

Do you really need f2c?

jnareb commented Oct 14, 2022

> See the file map issue: f2c is available for version T only.
> oscar.pyc only uses current version (U) which is absent.
> getValues, on the other hand, checks if prior versions exist.

So ultimately this is to be considered a bug in oscar.py?

I need f2c to be able to, for example, find all repositories that contain a requirements.txt file. As far as I understand, this needs f2c and c2p / c2P, doesn't it?

I'd like to use the Python API via the oscar module and not have to worry about implementation details.
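
The lookup described here is a composition of two maps: file name to commits (f2c), then commit to projects (c2p). A minimal sketch of that composition, using plain dictionaries as hypothetical stand-ins for the real maps (the function name and the sample data are made up for illustration and are not part of oscar.py):

```python
def projects_containing(filename, f2c, c2p):
    """Return the set of projects with at least one commit touching filename.

    f2c: dict mapping a file path to an iterable of commit SHAs (stand-in).
    c2p: dict mapping a commit SHA to an iterable of project names (stand-in).
    """
    projects = set()
    for sha in f2c.get(filename, ()):
        projects.update(c2p.get(sha, ()))
    return projects


# Made-up sample data, for illustration only.
f2c = {"requirements.txt": ["c1", "c2"], "setup.py": ["c3"]}
c2p = {"c1": ["user_a_proj"], "c2": ["user_a_proj", "user_b_fork"], "c3": ["other"]}

result = projects_containing("requirements.txt", f2c, c2p)
```

Collecting into a set removes duplicates when several commits belong to the same project.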

audrism commented Oct 14, 2022

You may want more than just requirements.txt in the root folder, perhaps also Requirements.txt or requirements.md.

As such, zcat /da?_data/basemaps/gz/c2fFullU*.s | grep -i 'requirements.' may be a more accurate way for you to get what you need.
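
For those who prefer to stay in Python, the same scan can be sketched with the standard library. This assumes the .s files are gzip-compressed text with one record per line; grep_gz is a hypothetical helper name, not something oscar.py provides.

```python
import glob
import gzip
import re


def grep_gz(paths, pattern):
    """Yield lines from gzip-compressed text files that match pattern.

    Matching is case-insensitive, like `grep -i`. Lines are produced lazily,
    so the full output is never held in memory.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    for path in paths:
        with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
            for line in fh:
                if regex.search(line):
                    yield line.rstrip("\n")


if __name__ == "__main__":
    # Glob taken from the comment above; it matches nothing outside the da servers.
    files = sorted(glob.glob("/da?_data/basemaps/gz/c2fFullU*.s"))
    for hit in grep_gz(files, r"requirements\."):
        print(hit)
```

The dot is escaped here so that it matches a literal dot; in the grep command as written, the unescaped dot matches any character.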

Perhaps you have another use case for f2c?

jnareb commented Oct 14, 2022

Also, it might be interesting to examine changes to requirements.txt that were made due to some CVE (based on the commit message, or on external data such as CVE or NVD database).

audrism commented Oct 14, 2022

Perhaps you misunderstood: I completely agree that it makes sense to investigate certain classes of files.

What I am trying to say is that f2c is not a reliable way to do that, as filenames may be spelled in different ways and be in different folders.

jnareb commented Oct 14, 2022

Yes (somewhat) and no.

For files that are mainly processed by automation tools such as dependency management systems, the filename must be spelled in a specific way: for example, for pip it must be requirements.txt, for Gradle it must be build.gradle, etc.

Usually those files are also put in a specific directory (the root, or a subdirectory with a specific name), except in projects composed of many independent modules (which I assume are rather rare).
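
As a rough illustration of that point, matching on the exact base name finds such files in any directory while ignoring differently spelled variants. The helper name and the sample paths below are hypothetical:

```python
import posixpath

# File names expected by dependency tools, taken from the examples above.
MANIFEST_NAMES = {"requirements.txt", "build.gradle"}


def is_manifest(path, names=MANIFEST_NAMES):
    """Return True if the last component of path is exactly one of names."""
    return posixpath.basename(path) in names


# Made-up paths, for illustration only.
paths = [
    "requirements.txt",
    "services/api/requirements.txt",
    "docs/Requirements.txt",
    "requirements.md",
    "app/build.gradle",
]
matches = [p for p in paths if is_manifest(p)]
```

An exact-key lookup such as f2c would only find the first path, since the key is the full path rather than the base name.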

Anyway, should this problem be considered a bug in oscar.py, or a missing feature?

audrism commented Oct 15, 2022

The use cases you describe all argue against the key-value storage approach: the values for the keys of interest would be extremely large and could not be stored the same way as those for uncommon file names. They would also cause major problems both for the server trying to return the value and for the client trying to receive it (e.g., README.md).

Unless some legitimate use cases materialize, this suggests that the File class and f2any random lookups should be removed from oscar.py.

jnareb commented Oct 15, 2022

I don't quite understand your objections, or why you want to remove functionality from oscar.py (and limit the areas of research).

First, there are also uses for looking up rarely encountered files, like .gitattributes.

Second, I would assume that those File accessors that produce generators, like .commits, would not require storing all the results in memory, but would generate them on the fly. Or am I wrong here?

As to how oscar.py should, in my opinion, be improved:

  • make it use the same heuristic as getValues from lookup, that is, check whether prior versions of f2c exist and use them
  • mark some accessors / classes as deprecated, and describe in the documentation alternative means of finding the information, be it using the shell (zcat / zgrep) or via Python subprocesses
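
The first suggestion could look roughly like the sketch below. It assumes that versions are single uppercase letters ordered alphabetically and that files are named like /da5_fast/f2cFullT.0.tch, as in the paths quoted in this thread; find_latest_tch is a hypothetical helper, and how getValues actually performs this check is not shown here.

```python
import os
import string


def find_latest_tch(prefix, shard, newest="U", directory="/da5_fast"):
    """Return the path of the newest existing version of a .tch shard.

    Walks version letters from `newest` back to "A" and returns the first
    path that exists, or None if no version is present.
    """
    letters = string.ascii_uppercase
    start = letters.index(newest)
    for version in reversed(letters[: start + 1]):
        path = os.path.join(directory, f"{prefix}{version}.{shard}.tch")
        if os.path.exists(path):
            return path
    return None
```

With this, find_latest_tch("f2cFull", 13) would first try /da5_fast/f2cFullU.13.tch and then fall back to /da5_fast/f2cFullT.13.tch, and so on.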

Let's move this discussion to ssc-oscar/oscar.py#50

P.S. As for the problem of generating excessive load on the server, isn't that what job schedulers (also known as workload managers) are for?
