Chaos mesh is unable to start #98

Open
QYuQianchen opened this issue Oct 4, 2024 · 0 comments

Description

When running a network bandwidth experiment against an existing network, chaos-mesh fails to start. The returned error is "chaos-mesh is still in a 'starting' state after 10 seconds."
Note that the same setup (running against an existing Kurtosis testnet) works for the memory-stress test.

Logs

All pods in the chaos-mesh, kt-relaytestnet, and kurtosis-engine namespaces are running as expected in Kubernetes:

NAMESPACE                                          NAME                                                    READY   STATUS    RESTARTS         AGE
chaos-mesh                                         chaos-controller-manager-c47ffc85d-2fcvh                1/1     Running   1 (14m ago)      51m
chaos-mesh                                         chaos-controller-manager-c47ffc85d-s27pd                1/1     Running   0                51m
chaos-mesh                                         chaos-controller-manager-c47ffc85d-vdkfr                1/1     Running   1 (8m47s ago)    51m
chaos-mesh                                         chaos-daemon-lr9nh                                      2/2     Running   0                50m
chaos-mesh                                         chaos-dashboard-766759d4f4-s92sn                        1/1     Running   0                51m
chaos-mesh                                         chaos-dns-server-7c4f7b67c8-n6lfs                       1/1     Running   0                51m

Log printed to the console:

Launch attack scenario metaclear-network-bandwidth...
INFO[0000] Loading test suite from /Users/qyu/attacknet/test-suites/metaclear-network-bandwidth.yaml 
INFO[0000] Loading kurtosis network configuration from /Users/qyu/attacknet/network-configs/metaclear-relayer-devnet.yaml 
INFO[0001] Looking for existing enclave identified by namespace kt-relaytestnet 
INFO[0030] An active enclave matching kt-relaytestnet was found 
INFO[0030] Creating a chaos-mesh client                 
INFO[0030] Waiting 50 seconds before starting fault injection 
INFO[0080] Running 1 tests                              
INFO[0080] Running test (1/1): 'metaclear-network-bandwidth' 
INFO[0080] Running test step (1/2): 'inject fault to beacon client' 
ERRO[0090] Error while running test #1                  
FATA[0090] chaos-mesh is still in a 'starting' state after 10 seconds. Check kubernetes events to see what's wrong.
 --- at /Users/qyu/attacknet/pkg/test_executor/executor.go:152 (waitForInjectionCompleted) --- 

Config

YAML config file for attacknet

attacknetConfig:
  grafanaPodName: grafana
  grafanaPodPort: 3000
  waitBeforeInjectionSeconds: 50
  reuseDevnetBetweenRuns: true
  allowPostFaultInspection: true
  existingDevnetNamespace: kt-relaytestnet

harnessConfig:
  networkPackage: github.com/kurtosis-tech/ethereum-package
  networkConfig: metaclear-relayer-devnet.yaml
  networkType: ethereum

testConfig:
  tests:
  - testName: metaclear-network-bandwidth
    health:
      enableChecks: true
      gracePeriod: 2m0s
    planSteps:
    - stepType: injectFault
      description: "inject fault to beacon client"
      chaosFaultSpec:
        kind: NetworkChaos
        apiVersion: chaos-mesh.org/v1alpha1
        spec:
          selector:
            namespaces:
              - kt-relaytestnet
            labelSelectors:
              kurtosistech.com/id: cl-3-lighthouse-geth
          target:
            mode: all
            namespaces:
              - kt-relaytestnet
            labelSelectors:
              kurtosistech.com/id: cl-2-lighthouse-geth
          mode: all
          action: bandwidth
          duration: 1m
          direction: to
          bandwidth:
            rate: '10kbps'
            limit: 20000
            buffer: 500
    - stepType: waitForFaultCompletion
      description: wait for faults to terminate
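
For comparison, here is a sketch of the same fault spec with the target's `namespaces` and `labelSelectors` nested under a `selector` key, which is how the Chaos Mesh NetworkChaos schema describes `target` as far as I understand it. It assumes attacknet passes `chaosFaultSpec` through to Chaos Mesh unchanged. It is untested, and whether the nesting is related to the 'starting' timeout is only a guess. All values are copied from the config above.

```yaml
# Sketch only, not a confirmed fix: the target selector fields are moved
# under `target.selector`; everything else matches the config above.
kind: NetworkChaos
apiVersion: chaos-mesh.org/v1alpha1
spec:
  selector:
    namespaces:
      - kt-relaytestnet
    labelSelectors:
      kurtosistech.com/id: cl-3-lighthouse-geth
  mode: all
  action: bandwidth
  duration: 1m
  direction: to
  target:
    mode: all
    selector:            # assumption: target takes the same selector/mode shape as the top-level spec
      namespaces:
        - kt-relaytestnet
      labelSelectors:
        kurtosistech.com/id: cl-2-lighthouse-geth
  bandwidth:
    rate: '10kbps'
    limit: 20000
    buffer: 500
```

If the fault still stays in 'starting', the Kubernetes events for the NetworkChaos resource in kt-relaytestnet are the next place to look, as the error message suggests.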