cdktf: Large terraform.<stack>.tfstate file is cleaned when destroy is cancelled #3269

Open
1 task
MJohnson459 opened this issue Nov 16, 2023 · 4 comments
Labels
bug: Something isn't working
enhancement/new-workflow: Proposed new workflow
priority/backlog: Low priority (though possibly still important). Unlikely to be worked on within the next 6 months.
ux/cli

Comments

@MJohnson459

Expected Behavior

I am running cdktf destroy from a Python script using subprocess.run. The exact code is:

import subprocess
subprocess.run(["cdktf", "destroy", "--auto-approve"], check=True)

While this is destroying instances, pressing Ctrl+C should stop the destroy and leave the terraform.<stack>.tfstate file with an accurate record of the resources that still exist.

Actual Behavior

The terraform.<stack>.tfstate file is empty.

Steps to Reproduce

I did a fair bit of testing to narrow down what exactly is happening here and reduce it to the smallest repeatable case. The Python subprocess call seems to be required for the state to be emptied permanently. However, when running cdktf destroy directly, I noticed that the state file is briefly emptied and then recreated. This seems to depend on the size of the state.

I suspect that when the subprocess is cancelled, the cdktf process is not given enough time to recreate the state file. Normally the cdktf process would be given about 10 seconds before the kill is escalated.
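If the suspicion above is right, one way to test it would be to give the child process more time to exit after Ctrl+C. The following is a minimal sketch, not a confirmed fix: `run_with_grace` and `grace_seconds` are hypothetical names, and it assumes that Ctrl+C in a terminal delivers SIGINT to both the Python script and the cdktf child, so the wrapper only has to wait rather than forward the signal.

```python
import subprocess


def run_with_grace(cmd, grace_seconds=30.0):
    """Run cmd and, on Ctrl+C, wait for it to exit before escalating to a kill."""
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait()
    except KeyboardInterrupt:
        # Assumption: the terminal has already delivered SIGINT to the child
        # (same foreground process group), so it only needs time to shut down.
        try:
            return proc.wait(timeout=grace_seconds)
        except subprocess.TimeoutExpired:
            proc.kill()  # last resort once the grace period runs out
            return proc.wait()


# Usage (not executed here):
# run_with_grace(["cdktf", "destroy", "--auto-approve"])
```

If the state file survives with this wrapper but not with subprocess.run, that would support the theory that the child is being killed before it finishes rewriting the state.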

  1. Start an instance with a lot of resources, for example many files.
  2. Run cdktf destroy via Python:
import subprocess
subprocess.run(["cdktf", "destroy", "--auto-approve"], check=True)
  3. While the resources are being destroyed, press Ctrl+C.

I can reliably reproduce this with a ~5MB state file.
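To check after an interrupted run whether the state file was left empty or truncated, a small helper along these lines could be used. It is only a sketch: `state_summary` is a hypothetical name, and it assumes the local state file is a JSON object with a top-level `resources` list.

```python
import json
from pathlib import Path


def state_summary(path):
    """Return (ok, detail) describing a local Terraform state file."""
    p = Path(path)
    if not p.exists():
        return False, "missing"
    raw = p.read_text(encoding="utf-8")
    if not raw.strip():
        return False, "empty"
    try:
        state = json.loads(raw)
    except json.JSONDecodeError as exc:
        return False, f"invalid JSON: {exc.msg}"
    if not isinstance(state, dict):
        return False, "unexpected top-level type"
    # Assumption: resources live in a top-level "resources" list.
    return True, f"{len(state.get('resources', []))} resources"


# Usage (the stack name is a placeholder):
# print(state_summary("terraform.my-stack.tfstate"))
```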

Versions

$ cdktf debug
language: python
cdktf-cli: 0.18.0
node: v18.12.0
cdktf: 0.18.0
constructs: 10.2.70
jsii: 1.89.0
terraform: 1.6.4
arch: x64
os: linux 5.15.0-88-generic
python: Python 3.9.6
pip: pip 21.1.3 from /home/michael/.pyenv/versions/3.9.6/envs/locus3.9/lib/python3.9/site-packages/pip (python 3.9)
pipenv: null

Providers

┌───────────────┬──────────────────┬─────────┬────────────┬─────────────────────────────┬─────────────────┐
│ Provider Name │ Provider Version │ CDKTF   │ Constraint │ Package Name                │ Package Version │
├───────────────┼──────────────────┼─────────┼────────────┼─────────────────────────────┼─────────────────┤
│ aws           │ 5.19.0           │ ^0.18.0 │            │ cdktf-cdktf-provider-aws    │ 17.0.8          │
├───────────────┼──────────────────┼─────────┼────────────┼─────────────────────────────┼─────────────────┤
│ random        │ 3.5.1            │ ^0.18.0 │            │ cdktf-cdktf-provider-random │ 9.0.0           │
├───────────────┼──────────────────┼─────────┼────────────┼─────────────────────────────┼─────────────────┤
│ tls           │ 4.0.4            │ ^0.18.0 │            │ cdktf-cdktf-provider-tls    │ 8.0.0           │
└───────────────┴──────────────────┴─────────┴────────────┴─────────────────────────────┴─────────────────┘

Gist

No response

Possible Solutions

It might be possible to write the new state to a temporary file and then swap it into place, instead of emptying the existing state file and rewriting it.
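The write-then-swap idea can be sketched as follows. This is an illustration in Python of the general technique, not how cdktf or Terraform writes state: `atomic_write` is a hypothetical helper, and it relies on `os.replace` swapping the file in a single rename when the temporary file is in the same directory as the target.

```python
import os
import tempfile


def atomic_write(path, data: bytes):
    """Write data to path so readers see either the old or the new content."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so the final rename
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-state-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes are on disk first
        os.replace(tmp_path, path)  # swap the new file into place
    except BaseException:
        # On any failure or interruption, drop the temporary file and
        # leave the original untouched.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

With this approach, an interruption before the rename leaves the previous state file intact, at the cost of briefly needing disk space for two copies.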

Workarounds

No response

Anything Else?

No response

References

No response

Help Wanted

  • I'm interested in contributing a fix myself

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
MJohnson459 added the bug (Something isn't working) and new (Un-triaged issue) labels on Nov 16, 2023
@DanielMSchmidt
Contributor

Hey, could you try running this outside of Python, directly through the CDKTF CLI, and see if the problem persists? We use the AbortController API under the hood and forward the abort signal to the Terraform CLI. If you abort the CDKTF CLI directly, I would assume it takes a few seconds for the abort to be processed inside the Terraform CLI. If the Python package kills the process immediately, or after a certain grace period, I think a corrupted state file is a possibility.

DanielMSchmidt added the priority/backlog, ux/cli, and enhancement/new-workflow labels and removed the new (Un-triaged issue) label on Nov 17, 2023
@MJohnson459
Author

Hey. Running cdktf directly doesn't permanently remove the state; however, the state file is still temporarily left corrupted. Do you know why the state file is emptied at all during the process? I had a look through the code but couldn't find where that happens. Is the state file managed by Terraform directly or by the CDK?

Given that the state is not being updated atomically, it seems likely that other workflows outside this specific Python case could also corrupt the state. I'm happy to look into a fix if you can point me in the right direction.

@DanielMSchmidt
Contributor

The state file is managed by Terraform directly, so I think the cause is most likely somewhere in Terraform.
