Move repo health job from Jenkins to GHA #66

Closed
jmbowman opened this issue Sep 23, 2022 · 15 comments
Labels
Arbi-BOM Tasks that Arbi-BOM is likely to undertake

Comments

@jmbowman

It seems like a good time to move the repo health dashboard generation job from Jenkins to GitHub Actions:

  • The last several runs have timed out at 90 minutes, but GitHub Actions are allowed to run for up to 6 hours
  • tCRIL would like to start running a public version of the job, and doesn't have a Jenkins server.
  • We only have a few people who understand how Jenkins jobs work, while more and more people are learning to understand and write GitHub Actions

Tentatively I'd like to put the bulk of the logic in a workflow template in https://github.com/openedx/.github. That would let 2U and tCRIL each collect the data for their own repositories (and other organizations could use it as well). The edX usage of the template should reside in https://github.com/edx/repo-health-data (which already has a job to update the Google Sheet when new data is committed).

Once we verify that the new job is working correctly, we should remove the Jenkins job and the code that implements it.

@jmbowman jmbowman moved this to Todo in Arbi-BOM Sep 23, 2022
@aht007 aht007 moved this from Todo to In Progress in Arbi-BOM Oct 10, 2022
@UsamaSadiq UsamaSadiq moved this from In Progress to Todo in Arbi-BOM Oct 20, 2022
@jmbowman jmbowman added the prioritizable Top-level issue of a project for prioritization label Oct 24, 2022
@jmbowman jmbowman moved this to Backlog in Platform-Core Roadmap Oct 24, 2022
@UsamaSadiq UsamaSadiq self-assigned this Nov 9, 2022
@UsamaSadiq UsamaSadiq moved this from Todo to In Progress in Arbi-BOM Nov 9, 2022
@UsamaSadiq
Member

UsamaSadiq commented Nov 14, 2022

Hi @jmbowman,
After an initial investigation of the Jenkins job and the description you provided, I have put together the following approach for this task. I would like to confirm it with you before starting the actual work, so that we are on the same page.

  • Currently, the Jenkins job/bash script lives in the jenkins-job-dsl repo and has a private wrapper inside jenkins-job-dsl-internal, which puts our builds behind the tools-jenkins.edx.org VPN.
  • We'll need to replace the bash script with a GitHub Actions workflow template (which would reuse the script if possible), placed inside openedx/.github.
  • The GitHub workflow template will receive the list of org names, the access token, and the target repo for saving reports as input values.
  • edx/repo-health-data will contain the workflow used by 2U, which will trigger the template from openedx/.github with org=edx, target_repo=repo-health-data and the respective access_token to update the respective reports.
  • openedx/.github will contain another workflow, which the tCRIL team will use to run the template workflow for org=openedx with the corresponding values.

My concerns are the following:

  • Is this the desired/optimal approach for our job architecture?
  • Do we need to create/provide a separate repo like repo-health-data for the open-source/tCRIL teams, or will they use the same repo-health-data repo to populate the data for now?

CC: @awais786 @iamsobanjaved

@iamsobanjaved iamsobanjaved moved this from Backlog to 2022 Q4 in Platform-Core Roadmap Nov 23, 2022
@jmbowman
Author

Sorry for the late reply; yes, that sounds about right. The one change from what you wrote is that the workflow in edx/repo-health-data will collect data for both the edx and openedx orgs (so something like orgs=edx,openedx). That means we don't need to create a separate repo for the openedx data at this time; tCRIL can create that themselves once the workflow has been created.

I suspect that we'll eventually want to run different sets of checks for the public and private workflows, for example adding data about security warnings or the state of 2U production deployments to the private one. That's one reason for keeping a private copy of the openedx repo health data.
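
For illustration only, here is a minimal Python sketch of how a comma-separated orgs value such as orgs=edx,openedx could be split into individual organisation names before data is collected for each one. The helper name parse_orgs is hypothetical; it is not taken from the actual workflow or from edx-repo-health, whose real input handling may differ.

```python
from typing import List


def parse_orgs(raw: str) -> List[str]:
    """Split a comma-separated org list (e.g. "edx,openedx") into names.

    Hypothetical helper: surrounding whitespace is stripped, empty entries
    are skipped, and duplicates are dropped while keeping the first-seen order.
    """
    seen = set()
    orgs = []
    for name in raw.split(","):
        name = name.strip()
        if name and name not in seen:
            seen.add(name)
            orgs.append(name)
    return orgs


if __name__ == "__main__":
    # One data-collection pass would then run per organisation.
    for org in parse_orgs("edx,openedx"):
        print(f"collect repo health data for org: {org}")
```

Keeping the org list as a single string input means the same workflow can serve one organisation or several without any change to its interface.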

@jmbowman jmbowman removed prioritizable Top-level issue of a project for prioritization Arbi-BOM Tasks that Arbi-BOM is likely to undertake labels Dec 5, 2022
@UsamaSadiq
Member

Created PR openedx/.github#52 to add both the reusable-repo-health-job-workflow and a template workflow for the orgs to trigger and use the reusable workflow.

@UsamaSadiq UsamaSadiq added the Arbi-BOM Tasks that Arbi-BOM is likely to undertake label Jan 16, 2023
@UsamaSadiq UsamaSadiq moved this from In Progress to In Code Review in Arbi-BOM Jan 16, 2023
@jmbowman jmbowman moved this from Author Team Review to In Progress in Arbi-BOM Feb 1, 2023
@jmbowman jmbowman moved this from 2022 Q4 to 2023 Q1 in Platform-Core Roadmap Feb 6, 2023
@UsamaSadiq UsamaSadiq moved this from In Progress to Author Team Review in Arbi-BOM Feb 22, 2023
@UsamaSadiq
Member

Created PR openedx/edx-repo-health#349 to move script to edx-repo-health.

@UsamaSadiq UsamaSadiq moved this from Author Team Review to Owner Review in Arbi-BOM Feb 23, 2023
@UsamaSadiq UsamaSadiq moved this from Owner Review to Approved in Arbi-BOM Mar 6, 2023
@UsamaSadiq
Member

UsamaSadiq commented Mar 13, 2023

The reusable workflow and the template trigger workflow have been added to the openedx/.github repo. The actual bash script being used has been added to the edx-repo-health repo.

The following steps will now be carried out one by one to complete this task:

  • A PR will be created to add the trigger workflow in the 2U org, which will use this reusable workflow to populate the existing Google Sheet.
  • An SRE ticket will be created to copy the existing credentials from the Jenkins env to the edx org secrets, which will be used by the workflow.
  • An announcement will be made to the Open edX community to tell them about this tool and list the steps other orgs need to follow to use it for their repos.

@UsamaSadiq
Member

Created SRE ticket to move credentials from Jenkins env to GitHub Actions.

@UsamaSadiq UsamaSadiq moved this from In Progress to Author Team Review in Arbi-BOM Mar 20, 2023
@UsamaSadiq UsamaSadiq moved this from Author Team Review to Blocked in Arbi-BOM Mar 27, 2023
@UsamaSadiq
Member

Currently blocked on SRE copying the credentials, which needs to happen before the final workflow is merged.

@UsamaSadiq UsamaSadiq moved this from Blocked to In Progress in Arbi-BOM Mar 29, 2023
@UsamaSadiq UsamaSadiq moved this from In Progress to Author Team Review in Arbi-BOM Mar 31, 2023
@UsamaSadiq
Member

Credentials have been copied over by the SRE team. The final PR in the repo-health-data repo is now under team review; it will be the last PR under this issue.

@adzuci

adzuci commented Apr 24, 2023

@UsamaSadiq, @ohnickmoy and I were reviewing what you said in this issue and the code called from https://github.com/openedx/.github/blob/master/.github/workflows/repo-health-job.yml#L23. Could you clarify where the two workflows you mentioned would each live?

@UsamaSadiq
Member

UsamaSadiq commented May 2, 2023

Hi @adzuci @ohnickmoy
Since testing is still in progress, arbi-bom hasn't announced the change to the community yet, but I'll try to summarise where the workflows live and how to use them.

  1. The reusable repo health job workflow lives in the openedx/.github repo and can be referenced from any repo, by any organisation, using the provided template.
  2. The reusable workflow runs the bash script that performs the repo health checks; the script lives in the openedx/edx-repo-health repository.
  3. arbi-bom is currently testing the trigger workflow in the edx/repo-health-data repository. Once fully tested, it will replace the currently running Jenkins job and create the repo health dashboard with all the data from the repos under the edx and openedx organisations.

Note:
Since the repo health dashboard will also be accessible to the Axim team, there is no need to add a separate workflow call under the openedx organisation.
Open-source communities can use the added workflow template if they want to create a similar dashboard for their own repos, but they'll have to provide their own secret keys and Google Sheet credentials, as mentioned in the template.

@jmbowman
Author

jmbowman commented Jun 1, 2023

I added suggested implementation steps for setting up an Axim/public version of the repo health data in openedx/axim-engineering#530 (comment). You can add another comment there with clarification if I missed anything.

@UsamaSadiq
Member

I am attempting to run the repo health job via GitHub Actions from the repo-health-data repo, where our trigger workflow lives: https://github.com/edx/repo-health-data/actions/runs/5319878094/jobs/9632904651.

The latest workflow run now runs successfully for multiple organisations and creates yaml files for all the repositories under each organisation. See the result of sample job execution on a custom branch bom-test for 2 repositories: https://github.com/edx/repo-health-data/compare/master...bom-test?diff=unified

It currently has the following two issues:

  1. The final step, write_squashed_metadata_to_sqlite, fails when trying to get the parsed data of the repositories. (I am currently investigating the reason for the failure.)
  2. The check_ownership check fails when the gspread package's utils are used to fetch the credentials file with the given credentials (from secrets): at that point the function returns a whole dictionary instead of a key to use as the file name, and the check fails.
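
To illustrate the dictionary-versus-file-name mismatch described in the second issue, here is a minimal, standard-library-only Python sketch that accepts credentials as a parsed dictionary, a JSON string (as a secret is often stored), or a path to an existing file, and always hands back a file path. The helper credentials_to_file and the file name credentials.json are hypothetical; this is not the check_ownership code and not necessarily how the failure was resolved.

```python
import json
from pathlib import Path


def credentials_to_file(credentials, directory):
    """Return a Path to a JSON credentials file (hypothetical helper).

    `credentials` may be:
      * a dict that has already been parsed,
      * a JSON string (text starting with "{"), or
      * a path to an existing credentials file.
    """
    if isinstance(credentials, dict):
        data = credentials
    else:
        text = str(credentials).strip()
        if text.startswith("{"):
            # Looks like raw JSON, so parse it (raises ValueError if invalid).
            data = json.loads(text)
        else:
            # Otherwise treat the value as a file name.
            path = Path(text)
            if not path.is_file():
                raise FileNotFoundError(f"credentials file not found: {text}")
            return path

    # Write dict/JSON credentials out so callers that expect a file name work.
    target = Path(directory) / "credentials.json"
    target.write_text(json.dumps(data))
    return target
```

A caller that expects a file name can then pass the returned path on unchanged, whichever form the secret arrived in.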

@rgraber could you help me investigate the failure of the check_ownership check? It seems related to the secrets we added to the environment under https://2u-internal.atlassian.net/browse/DOS-3590.

FYI @jmbowman

@UsamaSadiq
Member

Update: The check_ownership check was fixed by switching the Google spreadsheet credential to one that has the correct read rights for the sheet. For details, see the PR https://github.com/edx/repo-health-data/pull/162.

@iamsobanjaved iamsobanjaved moved this from In Progress to Author Team Review in Arbi-BOM Jul 25, 2023
@UsamaSadiq
Member

Created a follow-up issue, openedx/edx-repo-health#405, to resolve the failing sqlite check.

@UsamaSadiq UsamaSadiq moved this from Author Team Review to Approved in Arbi-BOM Jul 26, 2023
@jmbowman jmbowman moved this from 2023 Q1 to 2023 Q3 in Platform-Core Roadmap Jul 27, 2023
@UsamaSadiq
Member

Announced the new tool to the community, following the communication guidelines [2U-internal document]. It was shared with the community through a blog post announcement.

@github-project-automation github-project-automation bot moved this from Approved to Done in Arbi-BOM Sep 6, 2023
@jristau1984 jristau1984 moved this from Done to Done - Long Term Storage in Arbi-BOM Sep 30, 2024
Projects
Status: Done - Long Term Storage
Status: 2023 Q3

3 participants