Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Log Sidekiq worker arguments #3386

Closed
wants to merge 4 commits into from
Closed

Conversation

mayorova
Copy link
Contributor

Adds more logging to Sidekiq, printing the arguments passed to the worker's perform method.

Compare before:

2023-05-25T21:05:48.076Z 3819931 TID-155ikb BillingWorker JID-d1c638b3fc271bed152893d8 BID-sXf1y0rExpEe2A INFO: start
2023-05-25T21:05:52.773Z 3819931 TID-155ikb BillingWorker JID-d1c638b3fc271bed152893d8 BID-sXf1y0rExpEe2A INFO: done: 4.697 sec

and after:

2023-05-25T21:36:29.556Z 3827198 TID-u08li BillingWorker JID-9423272bd99e4d7d3562869c BID-Qrr4OiSp2F2e9Q ARGS-[2445566778899, 2445599887766, "2023-05-25T00:00:00Z"] INFO: start
2023-05-25T21:36:34.364Z 3827198 TID-u08li BillingWorker JID-9423272bd99e4d7d3562869c BID-Qrr4OiSp2F2e9Q ARGS-[2445566778877, 2445599887766, "2023-05-25T00:00:00Z"] INFO: done: 4.808 sec

josemigallas
josemigallas previously approved these changes May 26, 2023
Copy link
Contributor

@akostadinov akostadinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its super great! I remember looking into this and seeing that a big monkey patch is required so I left it for later :)

Can we have a test case to make sure we don't break the log after sidekiq upgrade?

Here's an example test case for log messages:
https://github.com/3scale/porta/pull/3355/files#diff-e72e319062abc990bd308af9f5f03883c4ea52673979e0818fc8b1282d189493

@mayorova
Copy link
Contributor Author

Can we have a test case to make sure we don't break the log after sidekiq upgrade?

Well, the thing is - we will for sure break it after sidekiq upgrade 🙃
But yes, a failing test will remind us to change the approach. I'll try to add one.

Copy link
Contributor

@nidhi-soni1104 nidhi-soni1104 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

nidhi-soni1104
nidhi-soni1104 previously approved these changes May 30, 2023
Copy link
Contributor

@nidhi-soni1104 nidhi-soni1104 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@akostadinov
Copy link
Contributor

akostadinov commented May 30, 2023

I added a skeleton of a test. It only needs some logic to ignore uninteresting messages and validate the messages we care about.

The thing is that I believe we need to handle ActiveJob workers better. Presently the log for such worker looks like this (I've got the string when running the skeleton test as an error, because this message string is unhandled)

Enqueued ActiveJob::QueueAdapters::SidekiqAdapter::JobWrapper#68df5b02bafe791074e79dbb with args: [{'job_class'=>'PublishEnabledChangedEventForProviderApplicationsWorker', 'job_id'=>'19024126-7139-4a61-a983-89e8723d6d6e', 'provider_job_id'=>nil, 'queue_name'=>'default', 'priority'=>nil, 'arguments'=>[{'_aj_globalid'=>'gid://system/Account/4'}, 'created'], 'executions'=>0, 'locale'=>'en'}]

I don't have time to work on this more this week as I will be out of office Thursday and Friday. So leaving it here for now.

@mayorova
Copy link
Contributor Author

@akostadinov Thanks for looking at it!

I've updated the test here: fedcd2b

I guess that was your intention?

I picked a different worker, because BillingWorker is supposed to be run as part of a batch, so it fails when trying to update the batch summary.

But here is the thing - these logs (Enqueued ... with args) were already there, they are printed when the jobs are enqueued.
But what my PR adds is printing the job arguments (together with its ID) when the job is picked for processing by the worker. This is done by server middleware, so Sidekiq inline testing is not helpful here.

Also, the server middleware doesn't use Rails.logger, instead, it uses Sidekiq.logger.

@mayorova
Copy link
Contributor Author

@akostadinov what about that ActiveJob log?... I think it's OK.
Certainly, it is a bit different from "pure" Sidekiq job enqueue log, but I think that's fine. It might be useful to distinguish them.

@jlledom
Copy link
Contributor

jlledom commented May 31, 2023

Some thoughts:

First, I'm concerned about the logs size after this change, The AuditedWorker logs are specially long.

This is before this PR:

2023-05-31T07:42:03.390Z 15230 TID-1wl1i6 AuditedWorker JID-8226eb1dac0275ca0d5da82d INFO: done: 0.043 sec

And this is the same log after checking out this branch:

2023-05-31T07:10:35.189Z 8431 TID-2611an AuditedWorker JID-7c95415f3a7ef2e3e7b8e95f 
ARGS-[{"id"=>nil, "auditable_id"=>3, "auditable_type"=>"Settings", "user_id"=>2, 
"user_type"=>"User", "username"=>nil, "action"=>"create", "version"=>0, "created_at"=>nil, 
"tenant_id"=>nil, "provider_id"=>2, "kind"=>"Settings", "audited_changes"=>{"account_id"=>4, 
"bg_colour"=>nil, "link_colour"=>nil, "text_colour"=>nil, "created_at"=>"2023-05-31 07:10:33 
UTC", "menu_bg_colour"=>nil, "link_label"=>nil, "link_url"=>nil, "welcome_text"=>nil, 
"menu_link_colour"=>nil, "content_bg_colour"=>nil, "tracker_code"=>nil, "favicon"=>nil, 
"plans_tab_bg_colour"=>nil, "plans_bg_colour"=>nil, "content_border_colour"=>nil, 
"forum_enabled"=>false, "app_gallery_enabled"=>false, "anonymous_posts_enabled"=>false, 
"signups_enabled"=>true, "documentation_enabled"=>true, "useraccountarea_enabled"=>true, 
"refund_policy"=>nil, "privacy_policy"=>nil, "monthly_charging_enabled"=>true, 
"token_api"=>"default", "documentation_public"=>true, "forum_public"=>false, 
"hide_service"=>nil, "cc_terms_path"=>"/termsofservice", "cc_privacy_path"=>"/privacypolicy", 
"cc_refunds_path"=>"/refundpolicy", "change_account_plan_permission"=>"request", 
"strong_passwords_enabled"=>false, "change_service_plan_permission"=>"request", 
"can_create_service"=>false, "spam_protection_level"=>"none", "tenant_id"=>nil, 
"multiple_applications_switch"=>"denied", "multiple_users_switch"=>"denied", 
"finance_switch"=>"denied", "multiple_services_switch"=>"denied", "groups_switch"=>"denied", 
"account_plans_switch"=>"denied", "authentication_strategy"=>"oauth2", "janrain_api_key"=>nil,
 "janrain_relying_party"=>nil, "service_plans_switch"=>"denied", "public_search"=>false, 
"product"=>"connect", "branding_switch"=>"denied", "monthly_billing_enabled"=>true, 
"cms_token"=>nil, "cas_server_url"=>nil, "sso_key"=>nil, "sso_login_url"=>nil, 
"cms_escape_draft_html"=>true, "cms_escape_published_html"=>true, "heroku_id"=>nil, 
"heroku_name"=>nil, "setup_fee_enabled"=>false, "account_plans_ui_visible"=>false, 
"service_plans_ui_visible"=>false, "skip_email_engagement_footer_switch"=>"denied", 
"web_hooks_switch"=>"denied", "iam_tools_switch"=>"denied", 
"require_cc_on_signup_switch"=>"denied", "enforce_sso"=>false}, "comment"=>nil, 
"associated_id"=>nil, "associated_type"=>nil, "remote_address"=>"127.0.0.1", 
"request_uuid"=>"f1dcce2f-ee47-4398-8068-e0f3f5d71d47"}] INFO: start

I think we should add a release note to warn customers that their logs are going to increase much faster. So they will need to update their log rotation policies and either provide more storage or keep the logs less time.

On the other hand, if collectiveidea/audited#674 gets implemented, we'll have truncated logs that could not be useful. In fact I wonder if it's worth to log the parameters for AuditedWorker when we already have the audit logs in the DB.

Would it be hard to exclude some particular Workers from auditing arguments?

@akostadinov
Copy link
Contributor

I remember seeing in the past that wrapped jobs (active job workers) has a special logging not to be that ugly but maybe not everywhere. ok.

Good point @jlledom , maybe we need to limit log message size. We can have audit messages to be huge. And in fact, if we log all parameters, then there is no point to actually put the audit in the audits table...

@mayorova
Copy link
Contributor Author

mayorova commented May 31, 2023

Actually, we need to review the "Enqueue" logs as well. For example, here is what we get in the logs a lot:

[ActiveJob] Enqueued SphinxAccountIndexationWorker (Job ID: 6e267419-e171-494d-92fe-1e64e4d6f668) to Sidekiq(default) with arguments: Account(id: integer, org_name: string, org_legaladdress: string, created_at: datetime, updated_at: datetime, provider: boolean, buyer: boolean, country_id: integer, provider_account_id: integer, domain: string, telephone_number: string, site_access_code: string, credit_card_partial_number: string, credit_card_expires_on: date, credit_card_auth_code: string, master: boolean, billing_address_name: string, billing_address_address1: string, billing_address_address2: string, billing_address_city: string, billing_address_state: string, billing_address_country: string, billing_address_zip: string, billing_address_phone: string, org_legaladdress_cont: string, city: string, state_region: string, state: string, paid: boolean, paid_at: datetime, signs_legal_terms: boolean, timezone: string, delta: boolean, from_email: string, primary_business: string, business_category: string, zip: string, extra_fields: text, vat_code: string, fiscal_code: string, vat_rate: decimal, invoice_footnote: text, vat_zero_text: text, default_account_plan_id: integer, default_service_id: integer, credit_card_authorize_net_payment_profile_token: string, tenant_id: integer, self_domain: string, s3_prefix: string, prepared_assets_version: integer, sample_data: boolean, support_email: string, finance_support_email: string, billing_address_first_name: string, billing_address_last_name: string, email_all_users: boolean, partner_id: integer, hosted_proxy_deployed_at: datetime, po_number: string, state_changed_at: datetime), 2441234567891

Basically, printing out all of the Account class attributes 🙃 which, I think, is completely useless.

@akostadinov
Copy link
Contributor

Maybe another option is to log the stuff we specifically care about and divert to super for the rest. Will save us troubles and allow us to add more stuff if we need in the future.

@mayorova
Copy link
Contributor Author

That's a great point @jlledom !
I was concerned about some arguments being too long, but I actually did not check whether any of the workers would have long args.

I think we can truncate the args array for all workers. Or as you say maybe exclude some workers.

@akostadinov
Copy link
Contributor

My vote is to custom log specific workers and super the rest. I wonder if the args can be parsed by elasticsearch.

@github-actions
Copy link

This PR is stale because it has not received activity for more than 14 days. Remove stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the Stale label Jun 15, 2023
@github-actions github-actions bot closed this Jun 23, 2023
@mayorova mayorova deleted the log-sidekiq-worker-arguments branch October 3, 2023 10:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants