memory bloat when using provider::kubernetes::manifest_decode_multi #2648

chrisseto opened this issue Dec 13, 2024 · 1 comment

chrisseto commented Dec 13, 2024

Terraform Version, Provider Version and Kubernetes Version

Terraform version: v1.8.5
Kubernetes provider version: v2.30.0
Kubernetes version: N/A

Affected Resource(s)

  • provider::kubernetes::manifest_decode_multi

Terraform Configuration Files

output "test" {
  value = provider::kubernetes::manifest_decode_multi(file("path-to-highly-nested-crd"))
}

Debug Output

N/A

Panic Output

N/A

Steps to Reproduce

  1. Acquire a collection of CRDs with many nested objects, all concatenated into a single file with --- separators. The Redpanda operator's CRDs are a good example.
  2. Begin monitoring resource utilization
  3. terraform plan

Expected Behavior

I would expect provider::kubernetes::manifest_decode_multi to consume roughly the same amount of resources as using yamldecode.

Actual Behavior

Calling provider::kubernetes::manifest_decode_multi as described will consume ~700MB of RAM before returning the decoded manifests.

Work Around

We found that it's possible to work around this issue by calling yamldecode directly. If your manifests are concatenated into a single file, you can use split("---\n", filecontents) and a comprehension to decode all of the manifests; a sketch follows below. Be wary of extraneous --- separators and comments, as we didn't find a great way to filter those out, though it should be possible with some more effort.
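A minimal sketch of that workaround, assuming the CRDs are concatenated in a single file (the local names here are illustrative, not taken from our actual configuration):

locals {
  # Hypothetical path; point this at the file holding the concatenated CRDs.
  raw_documents = split("---\n", file("path-to-highly-nested-crd"))

  # Skip chunks that are only whitespace (e.g. from leading or trailing
  # separators). Comment-only documents are not handled here, as noted above.
  manifests = [
    for doc in local.raw_documents : yamldecode(doc)
    if trimspace(doc) != ""
  ]
}

output "test" {
  value = local.manifests
}

Since yamldecode is a Terraform built-in, this keeps decoding inside Terraform core and avoids the round trip through the provider's SDK.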

Important Factoids

Oddly enough, the memory bloat appears to come from Terraform's SDK encoding the return value of manifest_decode_multi in preparation for returning it to the host Terraform process.
[Screenshot: memory profile, 2024-12-12]

My best guess is that this is due to basetypes using value receivers instead of pointer receivers, though I stopped digging once it became apparent that this is more of a systemic issue with the SDK than something a simple optimization could mitigate.

We acquired a profile by slapping net/http/pprof into this provider's main and running it with the -debug flag, though the same behavior can be observed by replacing ./internal/framework/provider/functions/testdata/decode_multi.yaml with a sufficiently large CRD and running go test ./internal/framework/provider/functions -run 'TestManifestDecodeMulti' -memprofile mem.out.

Due to Kubernetes' lack of support for $ref, it's unfortunately easy for CRDs to become massive: shared schemas have to be inlined everywhere they appear. For example, if a PodSpec is accepted in more than one place, the CRD quickly becomes unwieldy.

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
chrisseto (Author) commented

I should note that the environment we discovered this in was a micro AWS instance with 2 vCPUs and 2Gi of memory. Our TF plan was sufficiently large that it pegged both cores, which could have exacerbated the problem.
