Getting traceback calls & no report is generated. #2908
Comments
@kiranravindran90 Thanks. Do you have instructions so we can reproduce your setup, including a full trace? |
@pombredanne Find the full trace & instruction details. Scan Instruction : Package : (~15GB) File types : Trace : |
This is related to: #2160 |
I am facing the same issue. OS: Linux (Debian) Timeout option set to: However, the scan still stops after 3600 seconds (60 minutes). I looked at https://github.com/nexB/scancode-toolkit/blob/develop/src/scancode/pool.py#L52, but I could not really understand how it works. First, what is this line actually doing: func(self, timeout=timeout or 3600) What exactly does the statement Second, what is the purpose of the |
@Jeeppler there is a new release of ScanCode and I would like to see if you are also experiencing the issue there. Can you try the latest 31.x?
I reckon it feels a bit arcane. This is a function that returns a function such that calling the inner function will always be subject to a timeout. The code is there because of historical multiprocessing bugs and a lot of trial and error to get things working correctly (obviously not enough!):
... long story made short: there are likely some changes and fixes in the code of Python that mean we may no longer need some of this machinery. The difficult thing here is that we support Windows, macOS and Linux, and they have subtly different bugs in their Python implementations at times and not-so-subtle differences in how they each handle forking and spawning processes. We support all of them :| If you think you could help track down these issues and have a reliably failing test case, I could maybe try to simplify the code in a branch and let you test this? |
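To make the question about timeout=timeout or 3600 concrete, here is a minimal sketch of a function that returns a function which always applies a timeout. It is not the actual pool.py code: with_default_timeout, DEFAULT_TIMEOUT and FakeResult are hypothetical names used only for illustration, and only the "timeout or 3600" expression is taken from the line quoted in this thread.

```python
DEFAULT_TIMEOUT = 3600  # seconds; the fallback value quoted in this thread


def with_default_timeout(func):
    """Return a wrapper that always calls func with a timeout (hypothetical helper)."""

    def wrap(self, timeout=None):
        # "timeout or DEFAULT_TIMEOUT" falls back whenever timeout is falsy,
        # so both None and 0 are replaced by DEFAULT_TIMEOUT.
        return func(self, timeout=timeout or DEFAULT_TIMEOUT)

    return wrap


class FakeResult:
    """Hypothetical stand-in for an async result; get() only echoes the timeout."""

    @with_default_timeout
    def get(self, timeout=None):
        return timeout


result = FakeResult()
no_timeout = result.get()                 # nothing passed: falls back to the default
zero_timeout = result.get(timeout=0)      # 0 is falsy: also falls back to the default
tiny_timeout = result.get(timeout=0.001)  # truthy value: passed through unchanged

print(no_timeout, zero_timeout, tiny_timeout)
```

Under these assumptions, passing None or 0 cannot disable the timeout, while any non-zero value is passed through. The sketch only illustrates the "or" idiom; it does not explain why a larger value set by the user was not honored in the reports in this thread.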
Since Debian 11 comes with Python 3.9, it would be nice to have an archive for Python 3.9; otherwise I would have to install ScanCode via
I would love to help fix the bug, including coding something as well. However, so far I have been unable to find public test cases/files to test it with. I am not allowed to share the files I use. If you have an idea what type of binaries could work, let me know. I am willing to test them. |
Related to: #2703 |
Can I disable the timeout if I set it to |
I was able to reproduce the issue with the Armbian image of the Jetson Nano:
The image takes more than an hour to scan. On Debian 11 with Scancode-Toolkit version 30.1.0, I get the following output:
I set |
@Jeeppler Thanks for digging further. What should be the default max size above which it does not make sense to attempt scanning? 100MB? 200MB? |
(FWIW, the uncompressed disk image above is 1.7GB ;) ) |
You can, but that may make everything fail. You can make it as short as 0.001 second if you want |
No, no skipping of the file. The requirement is to scan files of that size or bigger for license information. It takes hours, but Scancode can find license information in files of that size. In case you have more questions about the use case, you can reach me via email using the email address located in this file: https://github.com/mercedes-benz/sechub/blob/develop/MAINTAINERS.md |
The problem is that even if I set the I encountered the problem under Linux. It apparently works under Windows. |
I ran the test again with the new version of Setup:
I installed
Scancode ran for 60 minutes before raising the |
@Jeeppler Thanks.... clearly we need to revisit the timeout story! Thank you for all these efforts. |
@Jeeppler BTW https://github.com/mercedes-benz/sechub looks awesome! |
OK. Let's have an issue to track this. Do you mind entering one? This should be easy enough now that we automated the whole release build. An alternative could be an appimage of sorts. See also #2836
This should work fine and is tested on each build using the latest version of all deps with https://github.com/nexB/scancode-toolkit/blob/ded56e9120f5fdfb9a1a0309130bb4305a66aacb/azure-pipelines.yml#L194 If you want to have the deps "same-as-the-app", then something such as this may be better as it will use the same versions as the app:
|
Thanks for explaining what the difference between the package archive and the Furthermore:
|
I did another test with the Armbian image: After a few minutes (approx. 3-5), all the memory and the entire swap partition are used. See attached screenshots. Then suddenly, CPU utilization and memory drop. Looking at the processes. The
Scan command used:
|
I see the same pattern with a 64 GB machine. Is there a way to profile/investigate it? |
I was able to confirm: I am not sure what the issue is with |
Each whole file is loaded in memory indeed |
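For anyone who wants to investigate the memory pattern discussed above, here is a minimal sketch using the standard-library tracemalloc module to compare the peak Python-level allocations of a whole-file read against a chunked read. It does not use ScanCode code: the helper names, the 8 MB sample file and the 64 KB chunk size are assumptions for illustration only.

```python
import os
import tempfile
import tracemalloc

CHUNK_SIZE = 64 * 1024           # assumed chunk size for the comparison
SAMPLE_SIZE = 8 * 1024 * 1024    # assumed sample file size (8 MB)


def peak_bytes(func, *args):
    """Run func(*args) under tracemalloc and return the peak traced bytes."""
    tracemalloc.start()
    try:
        func(*args)
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def read_whole(path):
    """Load the entire file into memory at once."""
    with open(path, "rb") as f:
        return len(f.read())


def read_chunked(path):
    """Read the file in fixed-size chunks, keeping only one chunk at a time."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:
                break
            total += len(chunk)
    return total


# Create a throwaway sample file so the sketch is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * SAMPLE_SIZE)
    sample_path = tmp.name

try:
    whole_peak = peak_bytes(read_whole, sample_path)
    chunked_peak = peak_bytes(read_chunked, sample_path)
finally:
    os.remove(sample_path)

print(f"whole-file read peak: {whole_peak} bytes")
print(f"chunked read peak:    {chunked_peak} bytes")
```

tracemalloc only sees allocations made through Python's memory allocators, so memory used by native libraries or by child processes would need a different tool.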
@pombredanne I am happy to submit a patch/PR and fix this issue. However, my current solution is to overwrite My expectation is that the fix allows the user to set the value using the |
@pombredanne No, sorry, I cannot test the changes unless there is a ScanCode release of it (I get ScanCode from pip when building the Docker image). |
@nnobelis we install Scancode using
|
Thanks @Jeeppler. I will have a look when I have capacity. |
Getting traceback calls & no report is generated.
How To Reproduce
While scanning source code files, this error started popping up after a .sh file was encountered (image below).
Many traceback calls were followed by a "Multiprocessing Context Timeout Error".
System configuration