
Promise leaks in v3 release #1466

Open
t0yv0 opened this issue Oct 28, 2024 · 12 comments
Labels
kind/bug (Some behavior is incorrect or out of spec), needs-repro (Needs repro steps before it can be triaged or fixed)

Comments

t0yv0 (Member) commented Oct 28, 2024

What happened?

User comment:

This is causing "The Pulumi runtime detected that 951 promises were still active at the time that the process exited." for me, with no easy way to diagnose it.

Example

Need a repro.

Output of pulumi about

N/A

Additional context

No response

Contributing

Vote on this issue by adding a 👍 reaction.
To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already).

t0yv0 added the kind/bug and needs-triage labels on Oct 28, 2024
t0yv0 (Member, Author) commented Oct 28, 2024

From: #1425 (comment)

t0yv0 (Member, Author) commented Oct 28, 2024

@ffMathy this is relatively difficult for us to chase down from this description alone. Do you have any hints on how to reproduce? There is also an environment variable that may collect more information to help us fix the issue:

 PULUMI_DEBUG_PROMISE_LEAKS=true
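
For example (assuming a POSIX shell), the variable can be set for a single run:

    PULUMI_DEBUG_PROMISE_LEAKS=true pulumi up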

ffMathy commented Oct 28, 2024

Yes, sorry. I will provide new output with that environment variable set as soon as possible.

corymhall added the awaiting-feedback label and removed the needs-triage label on Oct 29, 2024
ffMathy commented Nov 14, 2024

Here are the leaks:

Promise leak detected:
    CONTEXT(816): rpcKeepAlive
    STACK_TRACE:
    Error:
        at Object.debuggablePromise (/workspaces/REDACTED/node_modules/@pulumi/runtime/debuggable.ts:84:75)
        at Object.rpcKeepAlive (/workspaces/REDACTED/node_modules/@pulumi/runtime/settings.ts:614:25)
        at Object.registerResource (/workspaces/REDACTED/node_modules/@pulumi/runtime/resource.ts:500:18)
        at new Resource (/workspaces/REDACTED/node_modules/@pulumi/resource.ts:556:13)
        at new ComponentResource (/workspaces/REDACTED/node_modules/@pulumi/resource.ts:1228:9)
        at new NodeGroupSecurityGroup (/workspaces/REDACTED/node_modules/@pulumi/nodeGroupSecurityGroup.ts:67:9)
        at REDACTEDComponent.createKubernetesCluster (/workspaces/REDACTED/src/apps/common-iac/src/REDACTED/index.ts:369:31)
        at new REDACTEDComponent (/workspaces/REDACTED/src/apps/common-iac/src/REDACTED/index.ts:45:32)
        at Object.<anonymous> (/workspaces/REDACTED/src/apps/common-iac/pulumi.ts:18:16)
        at Module._compile (node:internal/modules/cjs/loader:1572:14)
        at Object..js (node:internal/modules/cjs/loader:1709:10)
        at Module.load (node:internal/modules/cjs/loader:1315:32)
        at Function._load (node:internal/modules/cjs/loader:1125:12)
        at TracingChannel.traceSync (node:diagnostics_channel:322:14)
        at wrapModuleLoad (node:internal/modules/cjs/loader:216:24)
        at Module.require (node:internal/modules/cjs/loader:1337:12)
        at require (node:internal/modules/helpers:139:16)
        at Object.<anonymous> (/workspaces/REDACTED/node_modules/@pulumi/cmd/run/run.ts:434:33)
        at Generator.next (<anonymous>)
        at fulfilled (/workspaces/REDACTED/node_modules/@pulumi/pulumi/cmd/run/run.js:18:58)
    Promise leak detected:
    CONTEXT(817): resolveURN(resource:node-group-security-group[eks:index:NodeGroupSecurityGroup])
    STACK_TRACE:
    Error:
        at Object.debuggablePromise (/workspaces/REDACTED/node_modules/@pulumi/runtime/debuggable.ts:84:75)
        at /workspaces/REDACTED/node_modules/@pulumi/runtime/resource.ts:738:13
        at Generator.next (<anonymous>)
        at /workspaces/REDACTED/node_modules/@pulumi/pulumi/runtime/resource.js:21:71
        at new Promise (<anonymous>)
        at __awaiter (/workspaces/REDACTED/node_modules/@pulumi/pulumi/runtime/resource.js:17:12)
        at prepareResource (/workspaces/REDACTED/node_modules/@pulumi/pulumi/runtime/resource.js:489:12)
        at Object.registerResource (/workspaces/REDACTED/node_modules/@pulumi/runtime/resource.ts:503:24)
        at new Resource (/workspaces/REDACTED/node_modules/@pulumi/resource.ts:556:13)
        at new ComponentResource (/workspaces/REDACTED/node_modules/@pulumi/resource.ts:1228:9)
        at new NodeGroupSecurityGroup (/workspaces/REDACTED/node_modules/@pulumi/nodeGroupSecurityGroup.ts:67:9)
        at REDACTEDComponent.createKubernetesCluster (/workspaces/REDACTED/src/apps/common-iac/src/REDACTED/index.ts:369:31)
        at new REDACTEDComponent (/workspaces/REDACTED/src/apps/common-iac/src/REDACTED/index.ts:45:32)
        at Object.<anonymous> (/workspaces/REDACTED/src/apps/common-iac/pulumi.ts:18:16)
        at Module._compile (node:internal/modules/cjs/loader:1572:14)
        at Object..js (node:internal/modules/cjs/loader:1709:10)
        at Module.load (node:internal/modules/cjs/loader:1315:32)
        at Function._load (node:internal/modules/cjs/loader:1125:12)
        at TracingChannel.traceSync (node:diagnostics_channel:322:14)
        at wrapModuleLoad (node:internal/modules/cjs/loader:216:24)
        at Module.require (node:internal/modules/cjs/loader:1337:12)
        at require (node:internal/modules/helpers:139:16)
        at Object.<anonymous> (/workspaces/REDACTED/node_modules/@pulumi/cmd/run/run.ts:434:33)
        at Generator.next (<anonymous>)
        at fulfilled (/workspaces/REDACTED/node_modules/@pulumi/pulumi/cmd/run/run.js:18:58)

There are hundreds more, but that would be too large for GitHub, and they just repeat.
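
An aside on reading this output: below is a rough conceptual sketch, not Pulumi's actual source, of the tracking pattern behind it. Pending operations are wrapped in tracked promises, and whatever is still unsettled at process exit is reported as a leak with its context string.

    // Conceptual sketch only; names mirror the trace above, not Pulumi internals.
    const pending = new Map<number, string>();
    let nextId = 0;

    function debuggablePromise<T>(p: Promise<T>, ctx: string): Promise<T> {
      const id = nextId++;
      pending.set(id, ctx); // track while unsettled
      return p.finally(() => pending.delete(id)); // untrack once settled
    }

    process.on("exit", () => {
      for (const [id, ctx] of pending) {
        console.error(`Promise leak detected: CONTEXT(${id}): ${ctx}`);
      }
    });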

pulumi-bot added the needs-triage label and removed the awaiting-feedback label on Nov 14, 2024
flostadler (Contributor) commented

Thanks @ffMathy!

Could you add your Pulumi program, or a minimal repro if you have one, so we can debug this further? It seems like the affected component is eks:index:NodeGroupSecurityGroup; how are you using it?

I wonder if this is related to pulumi/pulumi#13307 (comment).
Are you adding a child to any of the components from outside the component itself, as in the sketch below?
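
A minimal sketch of that pattern (the component and resource names here are hypothetical, not taken from this thread):

    import * as pulumi from "@pulumi/pulumi";
    import * as aws from "@pulumi/aws";

    class MyComponent extends pulumi.ComponentResource {
      constructor(name: string, opts?: pulumi.ComponentResourceOptions) {
        super("example:index:MyComponent", name, {}, opts);
        this.registerOutputs({});
      }
    }

    const component = new MyComponent("my-component");

    // The child is created outside the component's constructor but still
    // declares the component as its parent. This is the pattern asked about.
    new aws.s3.Bucket("child-bucket", {}, { parent: component });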

flostadler added the awaiting-feedback label and removed the needs-triage label on Nov 15, 2024
ffMathy commented Nov 15, 2024

Unfortunately, I can't create a repro right now. The code base is huge and we are quite busy with Q4 work that needs to be finished.

The code I am using is this:

    const nodeSecurityGroup = new eks.NodeGroupSecurityGroup(
      'node-group-security-group',
      {
        clusterSecurityGroup: securityGroup,
        eksCluster: cluster.eksCluster,
        vpcId: vpc.vpcId,
      },
      {
        parent: this,
      },
    );

Do you need more? Perhaps the VPC, cluster and security group too?

pulumi-bot added the needs-triage label and removed the awaiting-feedback label on Nov 15, 2024
flostadler (Contributor) commented

Thanks! If you could also add the VPC, cluster, and security group, that would be great!
I'll spend some time trying to reproduce it using those resources.

flostadler removed the needs-triage label on Nov 15, 2024
ffMathy commented Nov 15, 2024

Here are more details covering everything related to the node group, security group, and cluster.

    const securityGroup = new aws.ec2.SecurityGroup(
      'cluster-security-group',
      {
        namePrefix: 'cluster-security-group-',
        vpcId: vpc.vpcId,
        description: 'REDACTED',
        ingress: [{ protocol: '-1', self: true, fromPort: 0, toPort: 0 }],
        egress: [
          { protocol: '-1', fromPort: 0, toPort: 0, cidrBlocks: ['0.0.0.0/0'] },
        ],
      },
      {
        parent: this,
      },
    );

    const nodeRole = new aws.iam.Role(
      'node-role',
      {
        assumeRolePolicy: JSON.stringify({
          Version: '2012-10-17',
          Statement: [
            {
              Effect: 'Allow',
              Principal: {
                Service: 'ec2.amazonaws.com',
              },
              Action: 'sts:AssumeRole',
            },
          ],
        }),
      },
      {
        parent: this,
      },
    );

    const cluster = new eks.Cluster(
      'cluster',
      {
        vpcId: vpc.vpcId,
        privateSubnetIds: vpc.privateSubnetIds,
        publicSubnetIds: vpc.publicSubnetIds,
        createOidcProvider: true,
        fargate: false,
        skipDefaultNodeGroup: true,
        clusterSecurityGroup: securityGroup,
        roleMappings: [
          {
            roleArn: getAdminRoleArn(),
            groups: ['system:masters'],
            username: 'pulumi:admin-usr',
          },
          {
            roleArn: gitHubActionsRole.arn,
            groups: ['system:masters'],
            username: 'pulumi:admin-usr',
          },
          {
            roleArn: nodeRole.arn,
            username: 'system:node:{{EC2PrivateDNSName}}',
            groups: ['system:bootstrappers', 'system:nodes'],
          },
        ],
      },
      {
        parent: this,
      },
    );

    new aws.iam.RolePolicyAttachment(
      'eks-worker-node-policy',
      {
        policyArn: 'arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy',
        role: nodeRole.name,
      },
      {
        parent: this,
      },
    );
    new aws.iam.RolePolicyAttachment(
      'cni-policy',
      {
        policyArn: 'arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy',
        role: nodeRole.name,
      },
      {
        parent: this,
      },
    );
    new aws.iam.RolePolicyAttachment(
      'ec2-container-registry-read-only-policy',
      {
        policyArn: 'arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly',
        role: nodeRole.name,
      },
      {
        parent: this,
      },
    );

    const nodeSecurityGroup = new eks.NodeGroupSecurityGroup(
      'node-group-security-group',
      {
        clusterSecurityGroup: securityGroup,
        eksCluster: cluster.eksCluster,
        vpcId: vpc.vpcId,
      },
      {
        parent: this,
      },
    );

    const nodeGroup = new aws.eks.NodeGroup(
      'node-group',
      {
        clusterName: cluster.eksCluster.name,
        nodeGroupNamePrefix: 'node-group-',
        nodeRoleArn: nodeRole.arn,
        subnetIds: pulumi
          .all([vpc.privateSubnetIds, vpc.publicSubnetIds])
          .apply(([privateSubnetIds, publicSubnetIds]) =>
            privateSubnetIds.concat(publicSubnetIds),
          ),
        scalingConfig: {
          desiredSize: isEnvironment('production') ? 2 : 1,
          maxSize: isEnvironment('production') ? 4 : 2,
          minSize: isEnvironment('production') ? 1 : 0,
        },
        updateConfig: {
          maxUnavailable: 1,
        },
        instanceTypes: ['t3a.2xlarge'],
      },
      {
        parent: this,
        replaceOnChanges: ['launchTemplate'],
      },
    );

Furthermore, this is the VPC:

    new awsx.ec2.Vpc(
      'vpc',
      {
        subnetStrategy: 'Legacy',
        numberOfAvailabilityZones: 2,
        cidrBlock: '10.0.0.0/16',
        enableDnsHostnames: true,
        enableDnsSupport: true,
        tags: {
          Name: 'ftrack-vpc',
        },
        subnetSpecs: [
          {
            type: 'Private',
            tags: {
              // this tag is needed to make AWS Load Balancer Controller work.
              // Read more: https://docs.aws.amazon.com/eks/latest/userguide/aws-load-balancer-controller.html
              'kubernetes.io/role/internal-elb': '1',
            },
          },
          {
            type: 'Public',
            tags: {
              // this tag is needed to make AWS Load Balancer Controller work.
              // Read more: https://docs.aws.amazon.com/eks/latest/userguide/aws-load-balancer-controller.html
              'kubernetes.io/role/elb': '1',
            },
          },
        ],
      },
      {
        parent: this,
        aliases: [
          {
            parent: this,
            name: 'vpc',
          },
        ],
      },
    );

flostadler added the needs-triage label on Nov 15, 2024
corymhall (Contributor) commented

I just tried to reproduce this with the provided example and wasn't able to. Need to dig into this some more.

corymhall added the needs-repro label and removed the needs-triage label on Nov 20, 2024
vladfrangu commented

Hey! I've been having a similar issue with v3 of this module, requiring me to downgrade back to 2.x to be able to deploy updates 😅. What would you need as a minimal repro sample to debug this issue? Thankfully my setup is super small, so it should be easy to debug. Should I make a repo or gist for this?

I'll attach the list of leaks for this; hopefully it helps.

leaks.txt

t0yv0 (Member, Author) commented Feb 20, 2025

A small repro can make a huge difference in how quickly we can narrow down a fix.

vladfrangu commented Feb 20, 2025

After battling deploys of EKS clusters on my local account, I've arrived at a reproduction sample that's super short!

import { version } from "./package.json";
import * as aws from "@pulumi/aws";
import * as eks from "@pulumi/eks";
import * as k8s from "@pulumi/kubernetes";
import { envString } from "./src/lib/shared";

console.log(`Hello, world! ${version}`);

const sharedConfigs: eks.ClusterOptions = {
  nodeAssociatePublicIpAddress: true,
  name: `eks-Cluster-test_cluster_${envString}`,
  // Update version as time goes on to avoid extra costs
  version: "1.31",
};

const cluster = new eks.Cluster("test-cluster", {
  instanceType: aws.ec2.InstanceType.C6a_Large,
  minSize: 1,
  maxSize: 2,
  desiredCapacity: 1,
  ...sharedConfigs,
});

const k8sProvider = cluster.provider;

// create DISCORD_BOTS namespace
// Comment this out and see the issue vanishes
// new k8s.core.v1.Namespace(
// 	'discord-bots-ns',
// 	{ metadata: { name: 'discord-bots' } },
// 	{ provider: k8sProvider, parent: cluster },
// );

Whenever I uncomment the code that creates the namespace in k8s, the leaking-promises error is immediately shown, but when it's commented out, pulumi works just fine!

FWIW it could be my fault; maybe I've missed some change from v2.x to 3.x, but I can't exactly see any... Unless cluster.provider isn't a provider like the typings suggest, but something else?
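
One way to probe that hypothesis (a hedged sketch, not a confirmed fix from this thread, and assuming cluster.kubeconfigJson is available in this version of eks.Cluster) is to construct the Kubernetes provider explicitly from the cluster's kubeconfig instead of reusing cluster.provider:

    import * as k8s from "@pulumi/kubernetes";

    // Hypothetical diagnostic: build a provider directly from the cluster's
    // kubeconfig output so there is no ambiguity about what object is passed.
    const explicitProvider = new k8s.Provider("explicit-k8s", {
      kubeconfig: cluster.kubeconfigJson,
    });

    new k8s.core.v1.Namespace(
      "discord-bots-ns",
      { metadata: { name: "discord-bots" } },
      { provider: explicitProvider, parent: cluster },
    );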
