Update burst compute workflow to prepare it for node18 backend #1

cgoina · 2025-02-26T18:51:01Z

This PR includes the changes needed to migrate burst compute to Node 18 and newer AWS backends.

Node 18 requires the AWS SDK for JavaScript v3, which no longer supports async Lambda invocation, so the old framework was replaced with a more complex step function that distributes the work using a Map state.
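
To illustrate how the work might be split into items for a Map state, here is a minimal sketch. The field names `datasetStartIndex` and `datasetEndIndex` echo the parameters named in the commit messages; the function `partitionIntoBatches`, the `batchId` field, the `batchSize` argument, and the exclusive end index are assumptions made for illustration and may not match the code in this PR.

```javascript
// Hypothetical helper: split the index range [startIndex, endIndex) into
// fixed-size batches, one item per Map state iteration.
function partitionIntoBatches(startIndex, endIndex, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError('batchSize must be a positive integer');
  }
  const batches = [];
  let batchId = 0;
  for (let start = startIndex; start < endIndex; start += batchSize) {
    batches.push({
      batchId,
      datasetStartIndex: start,
      // Exclusive end index (an assumption of this sketch).
      datasetEndIndex: Math.min(start + batchSize, endIndex),
    });
    batchId += 1;
  }
  return batches;
}
```

For example, `partitionIntoBatches(0, 10, 4)` yields three batches covering indices 0-4, 4-8, and 8-10.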

cgoina · 2025-02-26T18:54:21Z

.env.dev

@@ -0,0 +1,2 @@
+MAX_PARALLELISM=5000
+MAX_BATCHED_JOBS_ITERATIONS=2


At this point both values are the same as the defaults, so this file could be removed. I used it to test multiple values and find the best settings.
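
A minimal sketch of how such settings could be read with fallbacks, assuming the defaults are the two values shown in `.env.dev`. The helper `intFromEnv` and the `settings` object are hypothetical names, not the ones used in this repository.

```javascript
// Hypothetical helper: read a positive integer setting from an environment
// object, falling back to the default when the variable is unset or invalid.
function intFromEnv(env, name, defaultValue) {
  const parsed = Number.parseInt(env[name] ?? '', 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : defaultValue;
}

// Defaults assumed to match the values in .env.dev.
const settings = {
  maxParallelism: intFromEnv(process.env, 'MAX_PARALLELISM', 5000),
  maxBatchedJobsIterations: intFromEnv(process.env, 'MAX_BATCHED_JOBS_ITERATIONS', 2),
};
```

Passing the environment object as an argument keeps the helper easy to exercise with different values without touching `process.env`.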

cgoina · 2025-02-26T18:56:23Z

serverless.yml

    handler: src/main/nodejs/monitor.monitorHandler
    memorySize: 128
    timeout: ${self:custom.defaultJobTimeoutSecs}
    environment:
      DEBUG: ${self:custom.debug}
-      JOB_TIMEOUT_SECS: ${self:custom.defaultJobTimeoutSecs}
-      TASKS_TABLE_NAME: ${self:custom.tasksTable}
+

 stepFunctions:


This is the biggest and most important change.
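
For readers unfamiliar with distributed Map states, the sketch below shows roughly what such a state looks like in Amazon States Language, written as a JavaScript object so it can be inspected. It is not the definition from this PR: the builder `buildMapState`, the state name `RunBatch`, and all argument names are hypothetical. Note that the Amazon States Language field is spelled `ToleratedFailurePercentage`.

```javascript
// Hypothetical builder for a distributed Map state (Amazon States Language).
function buildMapState({ bucket, key, workerFunctionArn, maxConcurrency, toleratedFailurePercentage }) {
  return {
    Type: 'Map',
    // Read the batch items from a JSON array in S3 rather than from the state input.
    ItemReader: {
      Resource: 'arn:aws:states:::s3:getObject',
      ReaderConfig: { InputType: 'JSON' },
      Parameters: { Bucket: bucket, Key: key },
    },
    // Each item is processed by its own child workflow execution.
    ItemProcessor: {
      ProcessorConfig: { Mode: 'DISTRIBUTED', ExecutionType: 'STANDARD' },
      StartAt: 'RunBatch',
      States: {
        RunBatch: {
          Type: 'Task',
          Resource: 'arn:aws:states:::lambda:invoke',
          Parameters: { FunctionName: workerFunctionArn, 'Payload.$': '$' },
          End: true,
        },
      },
    },
    MaxConcurrency: maxConcurrency,
    ToleratedFailurePercentage: toleratedFailurePercentage,
    End: true,
  };
}
```

Reading items through `ItemReader` keeps the batch list out of the state input, which is one way to stay under the state input size limit.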

cgoina · 2025-02-26T18:57:38Z

serverless.yml

+    environment:
+      DEBUG: ${self:custom.debug}
+
+  cleanupBatch:


I tried an automatic time-to-live for the batch inputs, but those files were never cleaned up, so I added an explicit cleanup task.
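
As a rough sketch of what an explicit cleanup could involve, assuming the batch inputs are S3 objects whose keys are already known: the helper below groups keys into parameter objects for S3 `DeleteObjects` requests. The name `buildDeleteRequests` and its arguments are hypothetical and are not taken from the `cleanupBatch` function in this PR.

```javascript
// A single S3 DeleteObjects request accepts at most 1000 keys.
const MAX_KEYS_PER_DELETE = 1000;

// Hypothetical helper: group S3 keys into DeleteObjects request parameters.
function buildDeleteRequests(bucket, keys) {
  const requests = [];
  for (let i = 0; i < keys.length; i += MAX_KEYS_PER_DELETE) {
    requests.push({
      Bucket: bucket,
      Delete: {
        Objects: keys.slice(i, i + MAX_KEYS_PER_DELETE).map((Key) => ({ Key })),
        Quiet: true, // only report failures in the response
      },
    });
  }
  return requests;
}
```

Each element could then be passed to `new DeleteObjectsCommand(params)` from `@aws-sdk/client-s3` and sent with an `S3Client`.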

Cristian Goina added 24 commits November 10, 2023 08:58

a first take to use node18 but invoke async does not work

509cec2

use map to distribute the work

f49de7e

filter parameters to not pass batch results around

82e7ed3

eliminated filter params state

eda5a29

added persist results state in order to persist directly from the state function

eaa18b0

removed states that did not seem to be used and from map went directly to combine or combineerror

451189e

added states to handle partitioning of the batch jobs in order to satisfy the input size constraint of 256K

64c6e42

specify payload for combine function to exclude batches

a1d0ad4

put monitor back to check for timeout

9e35153

monitor needs all parameters that must propagate such as firstBatchId, lastBatchId, datasetStartIndex and datasetEndIndex

0367b23

use dump batches on s3 and use itemreader

9603af6

introduce an intermediate state to set completed flag

79c7665

specify runtime and architecture in the provider block - switch to arm64

8d8325d

Merge branch 'master' into node18

f22c50a

update some packages

32538bd

set up an org and updated the bundle plugin

f51e981

update runtime to node20x

8dd61b0

set toleratedPercentageFailure

b61d61a

added cleanup step - to clean batch inputs

70674df

changed logging message

932d865

dotenv; additional state to check for errors

038934e

renamed branching factor

1ed9091

updated the to describe the new workflow

b8498e4

update the default interface name

d506298


cgoina requested review from neomorphic and krokicki February 26, 2025 20:16

updated diagram

55578e3
