[Performance] Delay per-job of 0.87 seconds due to capturing EE metadata #11699
Comments
I hear that @nitzmahone may be effectively accomplishing this with the builder 3.0 refactor!
I'm really torn on how best to handle this: the entrypoint is almost certainly not the right place, since in almost all cases project-embedded collections wouldn't be included (which IIUC was part of the original design expectations). Further- even if you do tell it all the right places to look, We could preserve the existing behavior and get rid of the container startup penalty (at least temporarily) by having builder squirrel this away during the build, then have builder's default entrypoint script conditionally copy the memoized files into the job output instead of running core and Galaxy CLI at every container startup. That wouldn't fix any of the correctness issues, but it would maintain the existing behavior. I haven't yet heard what's actually being done with any of this stuff. IIUC, at least for AWX, it's stored in the DB per job execution (not per EE definition/JT/whatever), so it seems like capturing the collections that actually loaded during the execution would be much more useful than "here's the collections we found in the EE, they might or might not actually be what got used by the job". That info could probably be captured by
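The "memoize at build, copy at startup" idea in the comment above could be sketched roughly as follows. This is only an illustration, not builder's or runner's actual entrypoint: the file names in `MEMO_FILES`, the function `write_ee_metadata`, and the `fallback` callable are all hypothetical names introduced here.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable

# Hypothetical file names for the memoized metadata; not confirmed by this thread.
MEMO_FILES = ("collections.json", "ansible_version.txt")


def write_ee_metadata(memo_dir: Path, data_dir: Path, fallback: Callable[[Path], None]) -> str:
    """Copy build-time metadata into the job data dir, or fall back to live collection.

    Returns "memoized" when the cached files were copied and "fallback" otherwise.
    """
    sources = [memo_dir / name for name in MEMO_FILES]
    if all(src.is_file() for src in sources):
        data_dir.mkdir(parents=True, exist_ok=True)
        for src in sources:
            shutil.copy2(src, data_dir / src.name)
        return "memoized"
    # No complete cache: run whatever slow collection step the caller supplies.
    fallback(data_dir)
    return "fallback"


def demo() -> dict:
    """Exercise both paths in temporary directories."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        memo, job = root / "memo", root / "job"
        memo.mkdir()
        (memo / "collections.json").write_text("{}")
        (memo / "ansible_version.txt").write_text("placeholder version line\n")
        results["with_cache"] = write_ee_metadata(memo, job, lambda d: None)
        results["copied"] = sorted(p.name for p in job.iterdir())
        results["without_cache"] = write_ee_metadata(root / "missing", job, lambda d: None)
    return results
```

As the comment notes, this only removes the startup cost. It does nothing for correctness, since the copied files still describe what was found in the image at build time rather than what a job actually loaded.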
That did occur to me before. I feel fairly ambivalent about it, but it could get the job done.
It's not surfaced to AWX users in any way I know of, but we still have an obligation to collect it. My thought (which I was hoping would be easier and faster) would be to produce this somewhere in the awx_display callback. In that case, we could import some of the same code used by the Galaxy CLI command without incurring the penalty of starting up a new process. Again, I feel indifferent about the exact means by which this happens; I just want to avoid the current slow/spammy entrypoint commands. This would be a very substantial reduction in the time before the first event.
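To make the "penalty of starting up a new process" concrete, here is a minimal timing sketch. It only measures starting a bare Python interpreter versus doing trivial work in the running one; it does not invoke Ansible or the Galaxy CLI, so the numbers are a lower bound for illustration and not a reproduction of the 0.87 seconds in the title. The function names are made up for this example.

```python
import subprocess
import sys
import time


def time_new_process() -> float:
    """Seconds to start a fresh Python interpreter that does nothing."""
    start = time.perf_counter()
    subprocess.run([sys.executable, "-c", "pass"], check=True)
    return time.perf_counter() - start


def time_in_process() -> float:
    """Seconds to do a trivial piece of work in the already-running interpreter."""
    start = time.perf_counter()
    sys.version_info[:3]  # placeholder for work done in-process
    return time.perf_counter() - start


spawn_seconds = time_new_process()
inline_seconds = time_in_process()
print(f"new process: {spawn_seconds:.4f}s, in-process: {inline_seconds:.6f}s")
```

A real CLI adds its own imports and startup work on top of the interpreter launch, which is why moving the collection step into an already-running callback looks attractive.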
Just for giggles, I tried wiring up a collection load event handler on (plugin init timing aside though, the concept worked great- two lines of code in
I totally get your point about displaying the loaded collections, but I officially don't care either. What we currently get isn't much more than a few I will give a heads up to some people for performance testing, and also try to give a notice to anyone who might be receiving and processing this data to be braced for format/content changes.
Sounds like we're still struggling to come up with a valid answer to "who's actually using this info?". With that, my current inclination is to assume
@nitzmahone We discussed this in chat last week, but also capturing here for posterity: awx/awx/main/analytics/collectors.py Line 479 in 4345954
I'm not saying we should keep the existing (broken) implementation, but just dropping it altogether doesn't seem like the right answer either.
Closing, as this was broken, leading to ansible/ansible-runner#1273, and this logic is no longer run in the entrypoint.
ISSUE TYPE
SUMMARY
To reproduce:
If you use the AWX_ISOLATED_DATA_DIR, then you trigger special bash scripts in the entrypoint: https://github.com/ansible/ansible-runner/blob/devel/utils/entrypoint.sh
The time delay comes from two commands
The point of these 2 commands is to save the collection list and the Ansible version. These are saved to the job record, but not presented in the API.
I argue that these would be better associated with execution environments, and as such, this metadata does not need to be collected each job run.
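The argument that this metadata belongs to the execution environment rather than to each job run amounts to caching it per image. A minimal sketch of that idea, assuming a hypothetical `EEMetadataCache` class and a stand-in `fake_collect` function (neither exists in AWX or ansible-runner):

```python
from typing import Callable, Dict


class EEMetadataCache:
    """Cache metadata per execution environment image instead of per job."""

    def __init__(self, collect: Callable[[str], dict]) -> None:
        self._collect = collect
        self._by_image: Dict[str, dict] = {}
        self.collect_calls = 0

    def get(self, image: str) -> dict:
        # Only pay the collection cost the first time an image is seen.
        if image not in self._by_image:
            self.collect_calls += 1
            self._by_image[image] = self._collect(image)
        return self._by_image[image]


def fake_collect(image: str) -> dict:
    # Stand-in for the slow collection step; the values are placeholders.
    return {"image": image, "collections": {}, "ansible_version": "unknown"}


cache = EEMetadataCache(fake_collect)
for _ in range(3):
    cache.get("example.invalid/ee:latest")    # three jobs, same EE
cache.get("example.invalid/other-ee:latest")  # one job, different EE
print(cache.collect_calls)
```

One design caveat: an image tag can be repointed to a different image, so a real cache would probably want to key on an immutable identifier such as the image digest rather than on the tag used here.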