Update CodeBuild images to Linux2 standard5.0 (node16 to node18) + Update Docker images to use AmazonLinux:2023 (node18 and Python3.9) (data-dot-all#889)

### Feature or Bugfix
- Bugfix

### Detail
This PR upgrades every compute resource that uses node16 to node18.

- CodeBuild images: [Amazon Linux 2 x86_64 standard:4.0 uses node16](https://docs.aws.amazon.com/codebuild/latest/userguide/available-runtimes.html),
which is already deprecated. This PR updates the CodeBuild images to use
Amazon Linux 2 x86_64 standard:5.0 instead.
- Docker images: This PR replaces the AmazonLinux2 images with
[AmazonLinux2023](https://docs.aws.amazon.com/linux/al2023/ug/what-is-amazon-linux.html),
the next generation of Amazon Linux from Amazon Web Services. The
default Python version installed in AmazonLinux2023 is 3.9, so this PR
also upgrades Python from 3.8 to 3.9.
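
An upgrade like this touches many files, so it can help to scan for leftover references to the old runtimes afterwards. The following is a minimal sketch, not part of this PR: `STALE_PATTERNS` and `find_stale_references` are hypothetical names, and the patterns are only the old identifiers visible in the lines this commit removes, so the list is likely incomplete.

```python
import re

# Hypothetical helper, not part of the PR. The patterns mirror identifiers
# that appear in the removed lines of this commit; extend as needed.
STALE_PATTERNS = {
    "Python 3.8 interpreter": re.compile(r"python3\.8|pip3\.8|python38"),
    "Node 16 build arg": re.compile(r"NODE_VERSION=16\b"),
    # Matches "amazonlinux:2" but not "amazonlinux:2023".
    "Amazon Linux 2 base image": re.compile(r"amazonlinux:2(?!\d)"),
    "amazon-linux-extras": re.compile(r"amazon-linux-extras"),
    "old CodeBuild image constant": re.compile(r"AMAZON_LINUX_2_3\b"),
}


def find_stale_references(text):
    """Return (line_number, label, line) for each line matching a stale pattern."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in STALE_PATTERNS.items():
            if pattern.search(line):
                hits.append((number, label, line.strip()))
    return hits


if __name__ == "__main__":
    sample = (
        "FROM public.ecr.aws/amazonlinux/amazonlinux:2023\n"
        "ARG NODE_VERSION=18\n"
        "RUN amazon-linux-extras install python3.8\n"
    )
    for number, label, line in find_stale_references(sample):
        print(f"{number}: {label}: {line}")
```

In practice the function would be fed the contents of each Dockerfile and CDK stack file; a non-empty result points at a line that still needs migrating.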

### Relates
data-dot-all#782 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/). N/A

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use standard, proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
dlpzx authored Dec 7, 2023
1 parent 2e0fd39 commit 5061ecb
Showing 10 changed files with 125 additions and 94 deletions.
@@ -264,7 +264,7 @@ def __init__(self, scope, id, target_uri: str = None, **kwargs):
id=f'{pipeline.name}-build-{env.stage}',
environment=codebuild.BuildEnvironment(
privileged=True,
-build_image=codebuild.LinuxBuildImage.AMAZON_LINUX_2_3,
+build_image=codebuild.LinuxBuildImage.AMAZON_LINUX_2_5,
environment_variables=PipelineStack.make_environment_variables(
pipeline=pipeline,
pipeline_environment=env,
@@ -335,7 +335,7 @@ def __init__(self, scope, id, target_uri: str = None, **kwargs):
id=f'{pipeline.name}-build-{env.stage}',
environment=codebuild.BuildEnvironment(
privileged=True,
-build_image=codebuild.LinuxBuildImage.AMAZON_LINUX_2_3,
+build_image=codebuild.LinuxBuildImage.AMAZON_LINUX_2_5,
environment_variables=PipelineStack.make_environment_variables(
pipeline=pipeline,
pipeline_environment=env,
43 changes: 27 additions & 16 deletions backend/docker/dev/Dockerfile
@@ -1,24 +1,30 @@
FROM public.ecr.aws/amazonlinux/amazonlinux:2
FROM public.ecr.aws/amazonlinux/amazonlinux:2023

ARG NODE_VERSION=16
ARG NODE_VERSION=18
ARG NVM_VERSION=v0.37.2
ARG PYTHON_VERSION=python3.8
ARG PYTHON_VERSION=python3.9

RUN yum clean all
RUN yum -y install shadow-utils wget
RUN yum -y install openssl-devel bzip2-devel libffi-devel postgresql-devel gcc unzip tar gzip
RUN amazon-linux-extras install $PYTHON_VERSION
RUN yum -y install python38-devel
RUN yum -y install git
# Clean cache
RUN dnf clean all

RUN /bin/bash -c "ln -s /usr/bin/${PYTHON_VERSION} /usr/bin/python3"
# Installing libraries
RUN dnf -y install -y \
shadow-utils wget openssl-devel bzip2-devel libffi-devel \
postgresql-devel gcc unzip tar gzip

# Install Python
RUN dnf install $PYTHON_VERSION
RUN dnf -y install python3-pip python3-devel git

RUN useradd -m app

## Add source
WORKDIR /build

# Configuring path
RUN touch ~/.bashrc

# Install AWS CLI
RUN curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
RUN unzip awscliv2.zip
RUN ./aws/install
@@ -27,9 +33,11 @@ COPY ./docker/dev/wait-for-it.sh /build/wait-for-it.sh
RUN chmod +x /build/wait-for-it.sh
RUN chown -R app:root /build/wait-for-it.sh

## Add source
WORKDIR /dataall
RUN touch ~/.bashrc

# Configuring Node and CDK
RUN curl -o- https://raw.githubusercontent.com/creationix/nvm/$NVM_VERSION/install.sh | bash
RUN /bin/bash -c ". ~/.nvm/nvm.sh && \
nvm install $NODE_VERSION && nvm use $NODE_VERSION && \
@@ -46,17 +54,20 @@ $PATH" >> ~/.bashrc && \

RUN /bin/bash -c ". ~/.nvm/nvm.sh && cdk --version"

COPY ./requirements.txt dh.requirements.txt
# App specific requirements
COPY ./requirements.txt requirements.txt
COPY ./dataall/base/cdkproxy/requirements.txt cdk.requirements.txt
COPY ./dataall /dataall

# Install App requirements
RUN /bin/bash -c "${PYTHON_VERSION} -m pip install setuptools"
RUN /bin/bash -c "${PYTHON_VERSION} -m pip install -r requirements.txt"
RUN /bin/bash -c "${PYTHON_VERSION} -m pip install -r cdk.requirements.txt"

# App code
COPY ./dataall /dataall
ADD ./cdkproxymain.py /cdkproxymain.py
ADD ./local_graphql_server.py /local_graphql_server.py

RUN /bin/bash -c "${PYTHON_VERSION} -m pip install -U pip "
RUN /bin/bash -c "${PYTHON_VERSION} -m pip install -r dh.requirements.txt"
RUN /bin/bash -c "${PYTHON_VERSION} -m pip install -r cdk.requirements.txt"

WORKDIR /

ENTRYPOINT [ "/bin/bash", "-c", ". ~/.nvm/nvm.sh && uvicorn cdkproxymain:app --host 0.0.0.0 --port 8080" ]
54 changes: 31 additions & 23 deletions backend/docker/prod/ecs/Dockerfile
@@ -1,24 +1,28 @@
FROM public.ecr.aws/amazonlinux/amazonlinux:2
FROM public.ecr.aws/amazonlinux/amazonlinux:2023

ARG NODE_VERSION=16
ARG NODE_VERSION=18
ARG NVM_VERSION=v0.37.2
ARG DEEQU_VERSION=2.0.0-spark-3.1
ARG PYTHON_VERSION=python3.8
ARG PYTHON_VERSION=python3.9

# Clean cache
RUN dnf upgrade -y;\
find /var/tmp -name "*.rpm" -print -delete ;\
find /tmp -name "*.rpm" -print -delete ;\
dnf autoremove -y; \
dnf clean all; rm -rfv /var/cache/dnf

# Installing libraries
RUN yum upgrade -y \
&& find /var/tmp -name "*.rpm" -print -delete \
&& find /tmp -name "*.rpm" -print -delete \
&& yum autoremove -y \
&& yum clean all \
&& rm -rfv /var/cache/yum \
&& yum install -y \
RUN dnf -y install \
shadow-utils wget openssl-devel bzip2-devel libffi-devel \
postgresql-devel gcc unzip tar gzip \
&& amazon-linux-extras install $PYTHON_VERSION \
&& yum install -y python38-devel git \
&& /bin/bash -c "ln -s /usr/bin/${PYTHON_VERSION} /usr/bin/python3" \
&& curl https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip -o /tmp/awscliv2.zip \
postgresql-devel gcc unzip tar gzip

# Install Python
RUN dnf install $PYTHON_VERSION
RUN dnf -y install python3-pip python3-devel git

# Install AWS CLI
RUN curl https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip -o /tmp/awscliv2.zip \
&& unzip -q /tmp/awscliv2.zip -d /opt \
&& /opt/aws/install --update -i /usr/local/aws-cli -b /usr/local/bin \
&& rm /tmp/awscliv2.zip \
@@ -33,31 +37,35 @@ RUN curl -o- https://raw.githubusercontent.com/creationix/nvm/$NVM_VERSION/insta
&& /bin/bash -c ". ~/.nvm/nvm.sh && \
nvm install $NODE_VERSION && nvm use $NODE_VERSION && \
npm install -g aws-cdk && \
nvm alias default node && nvm cache clear" \
&& echo export PATH="\
nvm alias default node && nvm cache clear"

RUN echo export PATH="\
/root/.nvm/versions/node/${NODE_VERSION}/bin:\
$(${PYTHON_VERSION} -m site --user-base)/bin:\
$(python3 -m site --user-base)/bin:\
$PATH" >> ~/.bashrc && \
echo "nvm use ${NODE_VERSION} 1> /dev/null" >> ~/.bashrc \
&& /bin/bash -c ". ~/.nvm/nvm.sh && cdk --version"

RUN $PYTHON_VERSION -m pip install -U pip

# App specific
ADD backend/requirements.txt /dh.requirements.txt
# App specific requirements
ADD backend/requirements.txt /requirements.txt
ADD backend/dataall/base/cdkproxy/requirements.txt /cdk.requirements.txt

RUN /bin/bash -c "pip3.8 install -r /dh.requirements.txt" \
&& /bin/bash -c "pip3.8 install -r /cdk.requirements.txt"
# Install App requirements
RUN /bin/bash -c "${PYTHON_VERSION} -m pip install setuptools"
RUN /bin/bash -c "${PYTHON_VERSION} -m pip install -r requirements.txt"
RUN /bin/bash -c "${PYTHON_VERSION} -m pip install -r cdk.requirements.txt"

# App code
ADD backend/dataall /dataall
VOLUME ["/dataall"]
ADD backend/cdkproxymain.py /cdkproxymain.py

# App configuration file
ENV config_location="/config.json"
COPY config.json /config.json

# Glue profiling jobs jars
RUN mkdir -p dataall/modules/datasets/cdk/assets/glueprofilingjob/jars/
ADD https://repo1.maven.org/maven2/com/amazon/deequ/deequ/$DEEQU_VERSION/deequ-$DEEQU_VERSION.jar /dataall/modules/datasets/cdk/assets/glueprofilingjob/jars/

36 changes: 23 additions & 13 deletions backend/docker/prod/lambda/Dockerfile
@@ -1,34 +1,44 @@
FROM public.ecr.aws/amazonlinux/amazonlinux:2
FROM public.ecr.aws/amazonlinux/amazonlinux:2023

ARG FUNCTION_DIR="/home/app/"
ARG PYTHON_VERSION=python3.8
ARG PYTHON_VERSION=python3.9

RUN yum upgrade -y;\
# Clean cache
RUN dnf upgrade -y;\
find /var/tmp -name "*.rpm" -print -delete ;\
find /tmp -name "*.rpm" -print -delete ;\
yum autoremove -y; \
yum clean packages; yum clean headers; yum clean metadata; yum clean all; rm -rfv /var/cache/yum
dnf autoremove -y; \
dnf clean all; rm -rfv /var/cache/dnf

RUN yum -y install shadow-utils wget
RUN yum -y install openssl-devel bzip2-devel libffi-devel postgresql-devel gcc unzip tar gzip
RUN amazon-linux-extras install $PYTHON_VERSION
RUN yum -y install python38-devel
# Install libraries
RUN dnf -y install \
shadow-utils wget openssl-devel bzip2-devel libffi-devel \
postgresql-devel gcc unzip tar gzip

## Add your source
# Install Python
RUN dnf install $PYTHON_VERSION
RUN dnf -y install python3-pip python3-devel

## Add source
WORKDIR ${FUNCTION_DIR}

# App specific requirements
COPY backend/requirements.txt ./requirements.txt
RUN $PYTHON_VERSION -m pip install -U pip
RUN $PYTHON_VERSION -m pip install -r requirements.txt -t .

# Install App requirements
RUN /bin/bash -c "${PYTHON_VERSION} -m pip install setuptools"
RUN /bin/bash -c "${PYTHON_VERSION} -m pip install -r requirements.txt"

# App code
COPY backend/. ./

# App configuration file
ENV config_location="config.json"
COPY config.json ./config.json

## You must add the Lambda Runtime Interface Client (RIC) for your runtime.
RUN $PYTHON_VERSION -m pip install awslambdaric --target ${FUNCTION_DIR}

# Command can be overwritten by providing a different command in the template directly.
ENTRYPOINT [ "python3.8", "-m", "awslambdaric" ]
ENTRYPOINT [ "python3.9", "-m", "awslambdaric" ]
CMD ["auth_handler.handler"]
14 changes: 7 additions & 7 deletions deploy/stacks/container.py
@@ -81,7 +81,7 @@ def __init__(
container_definitions=[ecs.CfnTaskDefinition.ContainerDefinitionProperty(
image=cdkproxy_image.image_name,
name=cdkproxy_container_name,
-command=['python3.8', '-m', 'dataall.core.stacks.tasks.cdkproxy'],
+command=['python3.9', '-m', 'dataall.core.stacks.tasks.cdkproxy'],
environment=[
ecs.CfnTaskDefinition.KeyValuePairProperty(
name="AWS_REGION",
@@ -156,7 +156,7 @@ def __init__(

stacks_updater, stacks_updater_task_def = self.set_scheduled_task(
cluster=cluster,
-command=['python3.8', '-m', 'dataall.core.environment.tasks.env_stacks_updater'],
+command=['python3.9', '-m', 'dataall.core.environment.tasks.env_stacks_updater'],
container_id=f'container',
ecr_repository=ecr_repository,
environment=self._create_env('INFO'),
@@ -213,7 +213,7 @@ def __init__(
def add_catalog_indexer_task(self):
catalog_indexer_task, catalog_indexer_task_def = self.set_scheduled_task(
cluster=self.ecs_cluster,
-command=['python3.8', '-m', 'dataall.modules.catalog.tasks.catalog_indexer_task'],
+command=['python3.9', '-m', 'dataall.modules.catalog.tasks.catalog_indexer_task'],
container_id=f'container',
ecr_repository=self._ecr_repository,
environment=self._create_env('INFO'),
@@ -251,7 +251,7 @@ def add_share_management_task(self):
repository=self._ecr_repository, tag=self._cdkproxy_image_tag
),
environment=self._create_env('DEBUG'),
-command=['python3.8', '-m', 'dataall.modules.dataset_sharing.tasks.share_manager_task'],
+command=['python3.9', '-m', 'dataall.modules.dataset_sharing.tasks.share_manager_task'],
logging=ecs.LogDriver.aws_logs(
stream_prefix='task',
log_group=self.create_log_group(
@@ -281,7 +281,7 @@ def add_subscription_task(self):
subscriptions_task, subscription_task_def = self.set_scheduled_task(
cluster=self.ecs_cluster,
command=[
-'python3.8',
+'python3.9',
'-m',
'dataall.modules.datasets.tasks.dataset_subscription_task',
],
@@ -306,7 +306,7 @@ def add_subscription_task(self):
def add_bucket_policy_updater_task(self):
update_bucket_policies_task, update_bucket_task_def = self.set_scheduled_task(
cluster=self.ecs_cluster,
-command=['python3.8', '-m', 'dataall.modules.datasets.tasks.bucket_policy_updater'],
+command=['python3.9', '-m', 'dataall.modules.datasets.tasks.bucket_policy_updater'],
container_id=f'container',
ecr_repository=self._ecr_repository,
environment=self._create_env('DEBUG'),
@@ -328,7 +328,7 @@ def add_bucket_policy_updater_task(self):
def add_sync_dataset_table_task(self):
sync_tables_task, sync_tables_task_def = self.set_scheduled_task(
cluster=self.ecs_cluster,
-command=['python3.8', '-m', 'dataall.modules.datasets.tasks.tables_syncer'],
+command=['python3.9', '-m', 'dataall.modules.datasets.tasks.tables_syncer'],
container_id=f'container',
ecr_repository=self._ecr_repository,
environment=self._create_env('INFO'),
2 changes: 1 addition & 1 deletion deploy/stacks/dbmigration.py
@@ -141,7 +141,7 @@ def __init__(
id=f'DBMigrationCBProject{envname}',
project_name=f'{resource_prefix}-{envname}-dbmigration',
environment=codebuild.BuildEnvironment(
-build_image=codebuild.LinuxBuildImage.AMAZON_LINUX_2_3,
+build_image=codebuild.LinuxBuildImage.AMAZON_LINUX_2_5,
),
role=self.build_project_role,
build_spec=codebuild.BuildSpec.from_object(