Update burst compute workflow to prepare it for node18 backend #1

Open · wants to merge 25 commits into `master`

Commits (25)
509cec2
a first take to use node18 but invoke async does not work
Nov 10, 2023
f49de7e
use map to distribute the work
Nov 11, 2023
82e7ed3
filter parameters to not pass batch results around
Nov 11, 2023
eda5a29
eliminated filter params state
Nov 11, 2023
eaa18b0
added persist results state in order to persist directly from the sta…
Nov 13, 2023
451189e
removed states that did not seem to be used and from map went directl…
Nov 13, 2023
64c6e42
added states to handle partitioning of the batch jobs in order to sat…
Nov 13, 2023
a1d0ad4
specify payload for combine function to exclude batches
Nov 13, 2023
9e35153
put monitor back to check for timeout
Nov 13, 2023
0367b23
monitor needs all parameters that must propagate such as firstBatchId…
Nov 14, 2023
9603af6
use dump batches on s3 and use itemreader
Nov 16, 2023
79c7665
introduce an intermediate state to set completed flag
Nov 17, 2023
8d8325d
specify runtime and architecture in the provider block - switch to arm64
Nov 28, 2023
f22c50a
Merge branch 'master' into node18
Feb 18, 2025
32538bd
update some packages
Feb 18, 2025
f51e981
set up an org and updated the bundle plugin
Feb 21, 2025
8dd61b0
update runtime to node20x
Feb 22, 2025
b61d61a
set toleratedPercentageFailure
Feb 24, 2025
70674df
added cleanup step - to clean batch inputs
Feb 25, 2025
932d865
changed logging message
Feb 25, 2025
038934e
dotenv; additional state to check for errors
Feb 25, 2025
1ed9091
renamed branching factor
Feb 25, 2025
b8498e4
updated the to describe the new workflow
Feb 26, 2025
d506298
update the default interface name
Feb 26, 2025
55578e3
updated diagram
krokicki Feb 26, 2025
2 changes: 2 additions & 0 deletions .env.dev
@@ -0,0 +1,2 @@
MAX_PARALLELISM=5000
MAX_BATCHED_JOBS_ITERATIONS=2
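As a sketch of how these settings might be consumed (the actual loading reportedly goes through dotenv, per commit 038934e; the helper below is hypothetical and only parses `process.env` directly, falling back to the defaults stated in the comment below):

```javascript
// Hypothetical helper: read an integer setting from the environment,
// falling back to a default when the variable is unset or malformed.
const intFromEnv = (name, fallback) => {
  const parsed = parseInt(process.env[name], 10);
  return Number.isNaN(parsed) ? fallback : parsed;
};

// Defaults here mirror the values in .env.dev above.
const maxParallelism = intFromEnv('MAX_PARALLELISM', 5000);
const maxBatchedJobsIterations = intFromEnv('MAX_BATCHED_JOBS_ITERATIONS', 2);
```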
Comment from the author (Collaborator):
At this point both values are the same as the defaults, so this could be removed. I used it to test multiple values to find the best settings.

2 changes: 2 additions & 0 deletions .eslintrc.yml
@@ -4,8 +4,10 @@ env:
es2021: true
extends:
- airbnb-base
parser: '@babel/eslint-parser'
parserOptions:
ecmaVersion: 12
requireConfigFile: false
rules:
no-console: off
radix: off
19 changes: 11 additions & 8 deletions README.md
@@ -16,14 +16,14 @@ In the diagram below, the code you write is indicated by the blue lambda icons.
Here's how it works, step-by-step:
1) You define a **worker** function and a **combiner** function
2) Launch your burst compute job by calling the **dispatch** function with a range of items to process
3) The dispatcher will start copies of itself recursively and efficiently start your worker lambdas
4) Each **worker** is given a range of inputs and must compute results for those inputs and write results to DynamoDB
5) The Step Function monitors all the results and calls the combiner function when all workers are done
6) The **combiner** function reads all output from DynamoDB and aggregates them into the final result
3) The dispatcher partitions the work into batches and starts an AWS Step Function that implements a map-reduce workflow
4) The Step Function maps the batches to workers: each **worker** is given a range of inputs, computes results for those inputs, and then sends them to another workflow task that persists the results to a DynamoDB table
5) When all workers complete and their results are persisted to the database, the Step Function invokes the **combiner**, which reads all outputs from the DynamoDB table and aggregates them into the final result
6) The Step Function also monitors for a timeout and terminates the job if the configured time limit is reached
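As a sketch of step 4 (the real input/output contract is defined in the [Interfaces](docs/Interfaces.md) document; the field names below are illustrative assumptions, not the framework's actual API), a worker might look like:

```javascript
// Hypothetical worker Lambda: receives a batch range plus job parameters
// and returns one result per item in [startIndex, endIndex).
// Field names are illustrative -- see docs/Interfaces.md for the actual
// contract used by the burst-compute framework.
const workerHandler = async (event) => {
  const { startIndex, endIndex, jobParameters } = event;
  const results = [];
  for (let i = startIndex; i < endIndex; i += 1) {
    // Replace with the real per-item computation
    results.push({ index: i, value: i * (jobParameters.scale || 1) });
  }
  return results;
};

module.exports = { workerHandler };
```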

## Build

You need Node.js 12.x or later in your path, then:
You need Node.js 20.x or later in your path, then:

```bash
npm install
@@ -40,13 +40,16 @@ To deploy to the *dev* stage:
npm run sls -- deploy
```

This will create a application stack named `burst-compute-dev`.
This will create an application stack named `janelia-burst-compute-dev`.

To deploy to a different stage (e.g. "prod"), add a stage argument:
To deploy to a different stage (e.g. "prod") and a different organization (the default organization is 'janelia'), add stage and org arguments:
```bash
npm run sls -- deploy -s prod
npm run sls -- deploy -s prod --org myorg
```

This will create an application stack named `myorg-burst-compute-prod`.

## Usage

1. Create **worker** and **combiner** functions which follow the input/output specification defined in the [Interfaces](docs/Interfaces.md) document.
9 changes: 4 additions & 5 deletions docs/Interfaces.md
@@ -88,7 +88,7 @@ This function requires the dynamodb:Query permission on the **TasksTable**:
"Action": [
"dynamodb:Query"
],
"Resource": "arn:aws:dynamodb:<REGION>:*:table/burst-compute-<STAGE>-tasks",
"Resource": "arn:aws:dynamodb:<REGION>:*:table/<ORG>-burst-compute-<STAGE>-tasks",
"Effect": "Allow"
},
```
@@ -102,12 +102,11 @@ The dispatch function expects the following input object:
{
workerFunctionName: "Name or ARN of the user-defined worker Lambda function",
combinerFunctionName: "Name or ARN of the user-defined combiner Lambda function",
startIndex: "Start index to process, inclusive, e.g. 0",
endIndex: "End index to process, exclusive, e.g. N if you have N items to process",
datasetStartIndex: "Start index to process, inclusive, e.g. 0",
datasetEndIndex: "End index to process, exclusive, e.g. N if you have N items to process",
batchSize: "How many items should each worker instance process",
numLevels: "Number of levels in the dispatcher tree, e.g. 1 or 2",
maxParallelism: "Maximum number of batches to run",
searchTimeoutSecs: "Number of seconds to wait for job to finish before ending with a timeout",
jobsTimeoutSecs: "Number of seconds to wait for job to finish before ending with a timeout",
jobParameters: {
// Any parameters that each worker should receive
},
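For illustration, a concrete dispatch payload might look like the following. The field names follow the input spec above; all values, function names, and job parameters are hypothetical, and the actual invocation (via the AWS SDK or `aws lambda invoke`) is omitted:

```javascript
// Illustrative dispatch payload; field names follow the spec above,
// but every value here is a made-up example.
const dispatchInput = {
  workerFunctionName: 'my-app-dev-worker',       // hypothetical
  combinerFunctionName: 'my-app-dev-combiner',   // hypothetical
  datasetStartIndex: 0,
  datasetEndIndex: 100000,
  batchSize: 40,
  maxParallelism: 5000,
  jobsTimeoutSecs: 3600,
  jobParameters: { threshold: 0.5 },             // hypothetical
};

// Basic sanity check: how many worker batches this range implies
const numBatches = Math.ceil(
  (dispatchInput.datasetEndIndex - dispatchInput.datasetStartIndex)
    / dispatchInput.batchSize,
);
```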
Binary file modified docs/burst-compute-diagram.png