Stress Test our API #1737

Open
5 tasks
jpellizzari opened this issue Mar 16, 2022 · 6 comments

@jpellizzari
Contributor

jpellizzari commented Mar 16, 2022

We have a "demo" environment in GCP. I think the plan is to have more?

This presents an opportunity to stress test various features of our API, for example:

  • multi-cluster: how many clusters can we handle with our current caching?
  • namespaces: how does our code react in a system with 2k+ namespaces?
  • pagination: does our solution make things better?
  • users + rbac: how many user+namespace+object combinations can we currently support without slowing?
  • etc

Steps:

  • Gather a couple of people on the team to come up with a prioritized list of things we want to see tested
  • Work with whoever owns that environment to set up some scenarios which test those points, as well as a place to record results
  • Run the scenarios (in priority order, if we can't do all at once)
  • Collate the results
  • Create new tickets/spikes/investigation in the case that we need to change any implementation

This ticket can be seen through by multiple people as it runs; if you start it with step one you don't have to finish it.

@Callisto13
Contributor

Set up a call with Sam/Robin and some of us to discuss what we want to test here

@Callisto13 changed the title from "Stress Test Placeholder" to "Stress Test our API" on Apr 26, 2022
@JamWils
Contributor

JamWils commented May 3, 2022

I've used this tool in the past and it was really helpful https://www.artillery.io/

@jpellizzari
Contributor Author

> I've used this tool in the past and it was really helpful https://www.artillery.io/

I think we would need almost the opposite of this: instead of thousands/millions of requests as a client, we need thousands of clusters+namespaces in the "database" and only a couple of client requests per second.

Or at least that is the scaling dimension that most worries me.
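To illustrate the "few client requests, lots of stored data" shape described above, here is a minimal sketch of a low-rate latency probe. Everything in it is an assumption for illustration: `probe`, `summarize`, and the `fetch` callable are hypothetical names, and `fetch` stands in for whatever request against the API is being measured.

```python
import math
import statistics
import time
from typing import Callable, Dict, List


def probe(
    fetch: Callable[[], object],
    requests: int,
    rate_per_sec: float,
    sleep: Callable[[float], None] = time.sleep,
) -> List[float]:
    """Call fetch() at a fixed low rate and return each call's latency in seconds."""
    interval = 1.0 / rate_per_sec
    latencies: List[float] = []
    for _ in range(requests):
        start = time.perf_counter()
        fetch()  # hypothetical: e.g. one "list objects" request against the API
        elapsed = time.perf_counter() - start
        latencies.append(elapsed)
        # Sleep only for what is left of the interval after the call returned.
        sleep(max(0.0, interval - elapsed))
    return latencies


def summarize(latencies: List[float]) -> Dict[str, float]:
    """Summarize a non-empty list of latencies (p95 uses the nearest-rank method)."""
    ordered = sorted(latencies)
    p95_index = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return {
        "count": len(ordered),
        "median": statistics.median(ordered),
        "p95": ordered[p95_index],
        "max": ordered[-1],
    }
```

Running the same probe at, say, two requests per second before and after loading the environment with more clusters and namespaces would give comparable numbers for each scenario.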

@SamLR
Contributor

SamLR commented May 4, 2022

> > I've used this tool in the past and it was really helpful https://www.artillery.io/
>
> I think we would need almost the opposite of this: instead of thousands/millions of requests as a client, we need thousands of clusters+namespaces in the "database" and only a couple of client requests per second.
>
> Or at least that is the scaling dimension that most worries me.

From a resilience PoV some questions I'd be interested in investigating are:

  • When does the UI/app become unusable? (e.g. if I'm trying to show 10,000 HelmRelease resources, does it take minutes to respond, or is the page just too big?)
  • Does use of the app start blocking interactions with kube-api? Can I DoS kube-api by querying 10,000 resources and then force timeouts in Flux when it tries to deploy new resources?
  • Can I trigger resource exhaustion in a node running the app if the cluster can't scale any more?

Most of the high-risk scenarios I see involving the app centre on it either blocking actions against kube-api during an incident, or becoming a hindrance (or unusable) when there is a large number of resources to display. So it feels like some tooling to easily generate large numbers of resources, which can then be inspected via the app, would be useful.

This could probably be some sort of Bash script which spits out large artificial Flux configurations, or something fancier with some basic templating/config options.
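As a rough illustration of the generator idea above, here is a minimal sketch in Python rather than Bash. It is not an agreed design: the function names, the `stress-ns-`/`stress-hr-` naming scheme, the `podinfo` chart and repository, and the `helm.toolkit.fluxcd.io/v2beta1` API version are all assumptions that would need adjusting to the Flux version and charts actually installed.

```python
def namespace_manifest(name: str) -> str:
    """Return the YAML for a single Namespace."""
    return (
        "apiVersion: v1\n"
        "kind: Namespace\n"
        "metadata:\n"
        f"  name: {name}\n"
    )


def helmrelease_manifest(name: str, namespace: str) -> str:
    """Return the YAML for a single HelmRelease.

    Assumptions: the apiVersion matches the installed Flux CRDs, and a
    HelmRepository called "podinfo" exists in "flux-system".
    """
    return (
        "apiVersion: helm.toolkit.fluxcd.io/v2beta1\n"
        "kind: HelmRelease\n"
        "metadata:\n"
        f"  name: {name}\n"
        f"  namespace: {namespace}\n"
        "spec:\n"
        "  interval: 10m\n"
        "  chart:\n"
        "    spec:\n"
        "      chart: podinfo\n"
        "      sourceRef:\n"
        "        kind: HelmRepository\n"
        "        name: podinfo\n"
        "        namespace: flux-system\n"
    )


def generate(namespaces: int, releases_per_namespace: int) -> str:
    """Return one multi-document YAML string with all generated manifests."""
    docs = []
    for n in range(namespaces):
        ns = f"stress-ns-{n:05d}"
        docs.append(namespace_manifest(ns))
        for r in range(releases_per_namespace):
            docs.append(helmrelease_manifest(f"stress-hr-{r:05d}", ns))
    # "---" separates the documents in a multi-document YAML stream.
    return "---\n".join(docs)
```

For example, `generate(namespaces=2000, releases_per_namespace=5)` would produce 2,000 namespaces holding 10,000 HelmRelease objects in total. The output could be written to a file and committed to whatever source the test cluster reconciles from.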

@Callisto13
Contributor

@joshri would also like to know that the UI (graph) still looks pretty with a ton of crap in there

@SamLR
Contributor

SamLR commented May 4, 2022

I had a chat with @JamWils about this. I'm going to look at options for building Flux Bucket sources that can be set up for a couple of basic scenarios, and we can go from there.
