
fix: update ci caching logic #207

Merged 3 commits into main from update-ci-caching-logic on May 15, 2023
Conversation

laurentS
Contributor

Fixes (hopefully) #119.

See details in the issue. Hopefully this PR's tests will already see the benefit of the cache.
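For anyone reading along, the idea is to cache the tap's REST responses between CI runs so that repeated test runs don't re-consume API quota. Below is a minimal sketch of one common way to do that, using `requests-cache` backed by an SQLite file that a CI cache step could save and restore; the actual mechanism in this PR's diff may differ.

```python
# Illustrative sketch only, not necessarily what this PR's diff does:
# persist GitHub REST responses to a local SQLite file so that repeated
# test runs are served from cache instead of re-consuming API quota.
import requests
import requests_cache

requests_cache.install_cache(
    ".github_api_cache",        # written to .github_api_cache.sqlite
    backend="sqlite",
    expire_after=24 * 60 * 60,  # refresh cached responses after a day
)

# The first call hits api.github.com; identical calls within the expiry
# window come from the SQLite file and cost no rate-limit quota. A CI cache
# step would save/restore that file between workflow runs.
resp = requests.get("https://api.github.com/repos/MeltanoLabs/tap-github")
print(resp.from_cache)  # False on the first run, True on later runs
```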

@laurentS laurentS requested a review from ericboucher May 12, 2023 13:41
@laurentS laurentS changed the title Update ci caching logic fix: update ci caching logic May 12, 2023
@sonarcloud
sonarcloud bot commented May 12, 2023

Kudos, SonarCloud Quality Gate passed!

  • 0 Bugs (A rating)
  • 0 Vulnerabilities (A rating)
  • 0 Security Hotspots (A rating)
  • 0 Code Smells (A rating)
  • No Coverage information
  • No Duplication information

@laurentS
Contributor Author

@edgarrmondragon I'm wondering if the GitHub API token used for CI here is also used somewhere else. The changes in this PR seem effective: the last run took 35 minutes, 28 of which were spent waiting for new quota, yet the quota was consumed within about 10 minutes. By my calculation (running the tests locally), a full CI run with 5 Python versions should use about 2500 REST requests, and that didn't happen here; much less actually ran (only 1 Python version on 1 PR, with caching on the other). Are you able to check whether the token's rate allowance is being eaten up by something else? CI runs have taken 3-4 hours over the past few days, which makes it very hard to work on the repo.
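For reference, a quick way to check how much of a token's quota is left at any moment is the REST `rate_limit` endpoint, which does not itself count against the quota. A minimal sketch, run with whatever token the CI uses:

```python
# Check the remaining hourly REST quota for the token in GITHUB_TOKEN.
import os

import requests

resp = requests.get(
    "https://api.github.com/rate_limit",
    headers={"Authorization": f"token {os.environ['GITHUB_TOKEN']}"},
    timeout=10,
)
core = resp.json()["resources"]["core"]
print(f"remaining: {core['remaining']}/{core['limit']}, resets at {core['reset']}")
```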

@edgarrmondragon
Member

@laurentS It's probably easiest to regenerate the token and document the permissions. Do you know the minimum set of permissions required for the tap to run CI?

@laurentS laurentS merged commit 203b05e into main May 15, 2023
@laurentS laurentS deleted the update-ci-caching-logic branch May 15, 2023 14:46
@laurentS
Contributor Author

> @laurentS It's probably easiest to regenerate the token and document the permissions. Do you know the minimum set of permissions required for the tap to run CI?

@edgarrmondragon Yes, let's try this.
From what I gather in the code, we use 2 env vars for testing:

  • GITHUB_TOKEN has public_repo permissions
  • ORG_LEVEL_TOKEN has public_repo and read:org permissions (this is used in some tests, which are skipped if this value is not set in the env; see the sketch below)

That seems to be enough to get all tests to pass locally, so I suppose it should be the same with CI. It'd be great if you're able to rotate tokens to understand what is eating quotas so quickly.
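For illustration, the ORG_LEVEL_TOKEN-gated skip described above looks roughly like this in pytest (hypothetical test name, not the tap's actual test code):

```python
# Hypothetical example of the ORG_LEVEL_TOKEN-gated tests described above;
# names are illustrative only.
import os

import pytest

ORG_LEVEL_TOKEN = os.environ.get("ORG_LEVEL_TOKEN")


@pytest.mark.skipif(
    not ORG_LEVEL_TOKEN,
    reason="ORG_LEVEL_TOKEN not set; org-level streams need public_repo + read:org",
)
def test_org_level_streams():
    # Would exercise streams that require organization read access.
    assert ORG_LEVEL_TOKEN
```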

@edgarrmondragon
Member

> @laurentS It's probably easiest to regenerate the token and document the permissions. Do you know the minimum set of permissions required for the tap to run CI?
>
> @edgarrmondragon Yes, let's try this. From what I gather in the code, we use 2 env vars for testing:
>
> * `GITHUB_TOKEN` has `public_repo` permissions
> * `ORG_LEVEL_TOKEN` has `public_repo` and `read:org` permissions (this is used on some tests which are skipped if this value is not set in the env)
>
> That seems to be enough to get all tests to pass locally, so I suppose it should be the same with CI. It'd be great if you're able to rotate tokens to understand what is eating quotas so quickly.

GITHUB_TOKEN is generated for every run so it shouldn't be competing for resources.

I've updated ORG_LEVEL_TOKEN with read-only permissions for the MeltanoLabs org repos. Let me know if there are any issues; otherwise I'll open a PR to document the process for rotating the token.

@laurentS
Contributor Author

> GITHUB_TOKEN is generated for every run so it shouldn't be competing for resources.

I was going to say the rate limits are per account, not per token, but it's actually worse than that (from the docs):

> When using GITHUB_TOKEN, the rate limit is 1,000 requests per hour per repository.

I think this explains why we're hitting limits so often. My (very rough) calculation above was that we use ~500 REST calls per Python version (before this PR). If caching worked perfectly, we'd hit the limit just once per day, but since different branches don't share a cache, it's likely that we hit problems more frequently. I'll see if I can modify the tests a bit to use fewer requests per run.
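To make that back-of-the-envelope arithmetic concrete:

```python
# Rough numbers from the discussion above, nothing more precise than that.
calls_per_python_version = 500      # ~500 REST calls per version before this PR
python_versions = 5
calls_per_full_run = calls_per_python_version * python_versions  # ~2500

github_token_limit = 1000           # requests/hour/repository for GITHUB_TOKEN

# A single uncached full matrix run burns roughly 2.5 hours' worth of quota,
# so any second run (another branch, a cache miss) in the same window stalls
# waiting for the limit to reset.
print(calls_per_full_run / github_token_limit)  # 2.5
```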
