From d7f45b6c1023701d317896d145783b2ebbb4a7f5 Mon Sep 17 00:00:00 2001
From: rtjd6554 <174791724+rtjd6554@users.noreply.github.com>
Date: Wed, 20 Nov 2024 14:15:51 +0000
Subject: [PATCH] Removal of experimental notification and inclusion in functionality.

---
 README.md         | 3 +++
 docs/05-ingest.md | 1 -
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 1e0c949908..d711277a85 100644
--- a/README.md
+++ b/README.md
@@ -59,6 +59,9 @@ stack.
 - Persistent EMR bulk import stack: Similar to the above stack, but the EMR cluster is persistent, i.e. it never shuts
   down. This is appropriate if there is a steady stream of import jobs. The cluster can either be of fixed size or it
   can use EMR managed scaling.
+- EMR Serverless Bulk Import stack: Similar to the above 2 stacks in behaviour. This stack is created at Sleeper
+  instance deployment. It is the default way in which the bulk imports are run and provides benefit by the fact that
+  when no bulk import jobs are
 - Dashboard stack: This displays properties of the system in a Cloudwatch dashboard.
 
 The following functionality is experimental:
diff --git a/docs/05-ingest.md b/docs/05-ingest.md
index 3d1023eb91..dec48d9864 100644
--- a/docs/05-ingest.md
+++ b/docs/05-ingest.md
@@ -144,7 +144,6 @@ There are several stacks that allow data to be imported using the bulk import pr
 - `EmrServerlessBulkImportStack` - this causes an EMR Serverless application to be created when the Sleeper instance is deployed. This is the default EMR Bulk Import Stack.
   The advantage of using EMR Serverless is that when there are no bulk import jobs the applications stops with no wasted compute.
   The startup of the application is greatly reduced compared to standard EMR.
-  This stack is experimental.
 - `EmrBulkImportStack` - this causes an EMR cluster to be deployed each time a job is submitted to the EMR bulk import queue. Each job is processed on a separate EMR cluster.
   The advantage of the cluster being used for one job and then destroyed is that there is no wasted compute if jobs are submitted infrequently. The downside is that there is a