Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update Mobile UCR Versions for Domains (Script) #35158

Draft
wants to merge 13 commits into
base: master
Choose a base branch
from

Conversation

zandre-eng
Copy link
Contributor

Product Description

No user-facing changes.

Technical Summary

Link to ticket here.

Please note: This PR should not be merged and is only created for review purposes. The script contained in this PR will be run as a once-off script.

This PR is the code for a once-off script that will be run to update the Mobile UCR versions of all domains from V1 to V2. This will only be done for domains that do not have any manual references to V1 fixtures. Domains that do contain V1 fixtures will be flagged and will need to be handled separately.

Feature Flag

MOBILE_UCR

Safety Assurance

Safety story

Local testing

Automated test coverage

N/A

QA Plan

No QA planned.

Labels & Review

  • Risk label is set correctly
  • The set of people pinged as reviewers is appropriate for the level of risk of the change

@zandre-eng zandre-eng added the product/invisible Change has no end-user visible impact label Oct 1, 2024
@zandre-eng zandre-eng added the Open for review: do not merge A work in progress label Oct 1, 2024
Copy link
Contributor

@kaapstorm kaapstorm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just three suggestions, nothing blocking. I am happy with your approach.

corehq/mobile_ucr_v2_update_script.py Outdated Show resolved Hide resolved
corehq/mobile_ucr_v2_update_script.py Outdated Show resolved Hide resolved
corehq/mobile_ucr_v2_update_script.py Outdated Show resolved Hide resolved
@gherceg
Copy link
Contributor

gherceg commented Oct 1, 2024

Excited to see this happening! I see there are some tickets (here and here) already created to default to V2, but I can't tell if those changes have been deployed/what the current behavior on production is. It would make sense to me to ensure we default to V2 prior to running this, so just making sure that is the plan or if not, there is a reason why.

@zandre-eng
Copy link
Contributor Author

Thanks for sharing the concern @gherceg! I definitely agree this would be important to do beforehand and can confirm that we do have a plan outlined to first ensure that domains aren't able to go back to V1 before running the migration script from this PR. This also includes the related code changes from this Jira ticket to handle when users are reverting/publishing apps with V1 references.

Copy link
Contributor

@mkangia mkangia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non blocking comments.

Would be nice to have a -dry-run option to the script so it can be run without doing actual updates to see which apps it would update or not update eventually. I have always found that useful, specifically on production


mobile_ucr_domains = find_domains_with_toggle_enabled(MOBILE_UCR)

log_file = open(LOG_FILE, 'a')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this would leave the connection to the file open even after the script exits abruptly, so we should this with with option. This way you also don't need to worry about explicitly calling close.
with open(LOG_FILE, 'a') as log_file:

Does mode 'a' create the file if its missing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking through the code, I can remove the need for passing the log file around to the various helper functions. Since we already know what the file path is, we can just use this directly in the helper function for adding to the log. Addressed in 27c1c25.

Does mode 'a' create the file if its missing?

It does. Here is an article for more info on the append mode.

apps = get_apps_by_id(domain, app_ids)
for app in apps:
try:
# Don't look at app.is_released since the latest version might not be released yet
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirming that we are just picking app and not releases.
I believe one way to confirm that is by checking copy_of property on the application. Is it blank for applications and not blank for releases. it would be a good check to have though I assume you are anyway only picking up applications and not releases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirming that we are just picking app and not releases.

Could you please elaborate on what you mean by app vs release? Are you referring to app versions that have been marked as "Released" or are you referring to the app versions themselves?

This comment refers that we are picking the latest version of each app and we don't want to consider app.is_released since the latest version of an app might still be set to "Test" and not "Released"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you referring to app versions that have been marked as "Released" or are you referring to the app versions themselves?

app versions or app builds.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are getting the applications themselves using get_apps_by_id() which returns a list of Application objects. Since we are only getting the latest release of each app, these retrieved application objects should be the current build of the app.

@mkangia I'm not sure if the above answers your question, but let me know if it doesn't and I'd be happy to offline on this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are only getting the latest release of each app, these retrieved application objects should be the current build of the app.

I think there is a misunderstanding of how apps work.

So, an application is actually a running doc and the versions are saved instances of that application at different points(versions). My understanding is that here we want to update the application and not the latest created version/build.

Seems like we are okay here since you are using get_latest_app_ids_and_versions which is using app_manager/applications_brief couch view which only keeps applications and not builds since it only saves docs that have copy_of set as null

V1_FIXTURE_IDENTIFIER = 'src="jr://fixture/commcare:reports'
V1_FIXTURE_PATTERN = r'<.*src="jr://fixture/commcare:reports.*>'
V1_REFERENCES_PATTERN = r"<.*instance\('reports'\)/reports/.*>"
RE_V1_ALL_REFERENCES = re.compile(f"{V1_FIXTURE_PATTERN}|{V1_REFERENCES_PATTERN}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I trust this has been tested so it matches as needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These regex patterns come from this script which has been used to analyze all the domains with V1 references. Based on the results of the script that was run, these regex patterns have successfully picked up domains with V1 references.

These references were taken from this Confluence page, which shows what needs to change to migrate from V1->V2.

skip_domains = set()


def process():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the flow of code in the file made it a bit hard to follow along. Just reminding on keeping things in a file in logical order. I am big fan on vertical formatting

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see how this might be difficult with all the helper functions at the top. I've refactored this to have the main process function at the top instead eac2bcf.

@zandre-eng
Copy link
Contributor Author

Would be nice to have a -dry-run option to the script so it can be run without doing actual updates to see which apps it would update or not update eventually.

@mkangia I think this is a great idea. I have added a dry run option in 20d9401.

@@ -37,7 +37,7 @@
skip_domains = set()


def process():
def process(dry_run=False):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: totally up to you and fine to leave it as it is for this one, I usually keep dry run as True by default, it avoids accidental updates for me, but it is also fine either way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds like a good idea. Addressed in 25a0298.

Copy link
Contributor

@ajeety4 ajeety4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.
Just a non-blocking comment.

Also I think it would be good to test that after updating the app to v2 (through script or manually), it does not break anything. Preferably one of the app from echis domain.

# Don't look at app.is_released since the latest version might not be released yet
if app.mobile_ucr_restore_version != '2.0':
save_in_log(f"Processing App: {domain}: {app.name}: {app.id}")
if not has_non_v2_form(domain, app):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on this

Suggested change
if not has_non_v2_form(domain, app):
if not has_non_v2_reference(domain, app):

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 6dcc719.

Copy link
Contributor

@ajeety4 ajeety4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requested some changes based on latest findings as explained here

@ajeety4
Copy link
Contributor

ajeety4 commented Oct 15, 2024

Hey @zandre-eng ,

Wanted to add one more thing based on earlier script run for finding non-v2 apps.
The django machine for India is quite small (2 GB) and as a result, this script will cause memory error.
So we will need to do some slight modifications before running this for India. The modifications I made are in this gist. I can offline with you on this.

@zandre-eng
Copy link
Contributor Author

Wanted to add one more thing based on earlier script run for finding non-v2 apps.
The django machine for India is quite small (2 GB) and as a result, this script will cause memory error.
So we will need to do some slight modifications before running this for India. The modifications I made are in this gist. I can offline with you on this.

@ajeety4 Ah this is good to know, thanks for flagging. I have made a few adjustments based on the gist you provided (241fd96). Let me know if it looks good, or if there was something else I missed. Thanks!

@ajeety4
Copy link
Contributor

ajeety4 commented Oct 16, 2024

Wanted to add one more thing based on earlier script run for finding non-v2 apps.
The django machine for India is quite small (2 GB) and as a result, this script will cause memory error.
So we will need to do some slight modifications before running this for India. The modifications I made are in this gist. I can offline with you on this.

@ajeety4 Ah this is good to know, thanks for flagging. I have made a few adjustments based on the gist you provided (241fd96). Let me know if it looks good, or if there was something else I missed. Thanks!

Hey @zandre-eng , the changes looks good. Thanks for adding this. FYI for the run, I had set the memory limit to 900 MB.

One more adjustment I had to make to get the successful run for some of the domains was below, basically fetching one app at at time. Could be done now or while running the script.

Instead of below,

        apps = get_apps_by_id(domain, app_ids)
        for app in apps:

I used this

        for app_id in app_ids:
            app = get_apps_by_id(domain, [app_id])[0]

@zandre-eng
Copy link
Contributor Author

FYI for the run, I had set the memory limit to 900 MB.

That's good to know, thanks!

One more adjustment I had to make to get the successful run for some of the domains was below, basically fetching one app at at time. Could be done now or while running the script.

@ajeety4 Was this specifically for the India environment? Doing it in this manner will incur additional DB hits since we'll know need to retrieve each app one at a time. I think if this is specific to only the India env then I'll only let this adjustment run for them and keep the current functionality for the production env. This will help to minimize the amount of DB calls for envs that have a larger working memory.

@ajeety4
Copy link
Contributor

ajeety4 commented Oct 17, 2024

FYI for the run, I had set the memory limit to 900 MB.

That's good to know, thanks!

One more adjustment I had to make to get the successful run for some of the domains was below, basically fetching one app at at time. Could be done now or while running the script.

@ajeety4 Was this specifically for the India environment? Doing it in this manner will incur additional DB hits since we'll know need to retrieve each app one at a time. I think if this is specific to only the India env then I'll only let this adjustment run for them and keep the current functionality for the production env. This will help to minimize the amount of DB calls for envs that have a larger working memory.

Yes specific to the India environment. I agree with keeping the current code for the production environment and doing the adjustment at the run time sounds good.

@zandre-eng
Copy link
Contributor Author

Yes specific to the India environment. I agree with keeping the current code for the production environment and doing the adjustment at the run time sounds good.

@ajeety4 I decided to be more explicit about it and implement code for both workflows so that it's more clear what needs to change for low memory envs (0833b6f).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Open for review: do not merge A work in progress product/invisible Change has no end-user visible impact
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants