You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I noticed that shepherd is doing a bunch of Github calls and getting a bunch of data that's mostly just discarded. Wanted to float the idea of switching to the Github GraphlQL API (via octokit/graphql) for at least some if not all calls, to optimize some of this.
For example:
getPullRequestStatus - makes 2 Github calls - 1 call to get PR + 1 call to get PR status. I think using GraphQL we can get this down to 1 call, which would shave off ~50% of network time for that command, for each repo.
getCandidateRepos (org query) - makes 1 call but the data returned is fairly large. I tried it on my own org and got back 423 repos at 2.1MB. I then filtered that data for just full_name and archived fields and the resulting file size is 24KB. So we're retrieving almost 90x the data actually needed and just discarding the rest! This is the case for a lot of the Github calls.
I'd be happy to work on this but wanted to get an idea of the level of interest before I make a PR, bc it's a decent amount of work.
Here's the plan I had in mind:
Phase 1: split out all octokit calls into a new Github service
Phase 2: identify calls with most room for improvement and switch to GraphQL, rinse and repeat as needed
Phase 1 isn't strictly necessary, but I think it will help clean the code up a bit, aid in reuse, and also give us the flexibility to mix-and-match REST & GraphQL as necessary, among other benefits.
Batch
Work
Target Date
Status
1
Github service
2101 B
✅
2
getCandidateRepos
2102 B
ToDo
The text was updated successfully, but these errors were encountered:
Strategy makes sense and I am definitely interested in optimizing calls and data transfer. Before starting work on this, can you add a plan for delivering the update across all calls? For example:
Batch
APIs
Target Date
1
getPullRequestStatus, getCandidateRepos
2101 B
2
xxx, yy
2102 A
...
...
...
Where 2101 B stands for the second half of January 2021, 2102 A stands for the first half of February 2021, and so on.
Strategy makes sense and I am definitely interested in optimizing calls and data transfer. Before starting work on this, can you add a plan for delivering the update across all calls? For example:
Batch APIs Target Date
1 getPullRequestStatus, getCandidateRepos 2101 B
2 xxx, yy 2102 A
... ... ...
Where 2101 B stands for the second half of January 2021, 2102 A stands for the first half of February 2021, and so on.
This is a tough question to answer as it depends on a few major unknowns:
how straightforward it is to get parity on data returned via GraphQL vs REST
how long it takes to migrate a call over
my free time 😁
That being said, I have more confidence in phase 1 (Github service). For migrating actual functions, I'll have a better idea after porting over at least 1 function.
Here's what I can say for now, given the unknowns:
I noticed that
shepherd
is doing a bunch of Github calls and getting a bunch of data that's mostly just discarded. Wanted to float the idea of switching to the Github GraphlQL API (via octokit/graphql) for at least some if not all calls, to optimize some of this.For example:
full_name
andarchived
fields and the resulting file size is 24KB. So we're retrieving almost 90x the data actually needed and just discarding the rest! This is the case for a lot of the Github calls.I'd be happy to work on this but wanted to get an idea of the level of interest before I make a PR, bc it's a decent amount of work.
Here's the plan I had in mind:
Phase 1: split out all octokit calls into a new Github service
Phase 2: identify calls with most room for improvement and switch to GraphQL, rinse and repeat as needed
Phase 1 isn't strictly necessary, but I think it will help clean the code up a bit, aid in reuse, and also give us the flexibility to mix-and-match REST & GraphQL as necessary, among other benefits.
The text was updated successfully, but these errors were encountered: