Releases: data-dot-all/dataall
v1.6.1
What's Changed
Manual actions required
ONLY if you are upgrading!
In the first run, the CodePipeline will fail in the CDK Synth stage if no additional changes are made:
botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the AssumeRole operation: User: arn:aws:sts::111111111111:assumed-role/SOME ROLE/... is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::222222222222:role/cdk-hnb659fds-lookup-role-22222222222-eu-west-1
CodeBuild needs additional permissions to assume the IAM role in the CDK Synth stage. Since we cannot update this CodeBuild stage without running it, the permissions need to be added manually.
Upgrading from V1.6.0 to v1.6.1
The role to update is named <PREFIX>-<GITBRANCH>-codebuild-baseline-role. The role name also appears in the error message in the CodeBuild logs.
- Go to the IAM role (<PREFIX>-<GITBRANCH>-codebuild-baseline-role) and click on Add permissions > Create inline policy.
The policy of the CodeBuild execution role needs to include the following:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": "arn:aws:iam::*:role/cdk-hnb659fds-lookup-role*"
        }
    ]
}
- After the pipeline has successfully run, go back to the IAM role and remove the manually added policy. The policy is now added as part of infrastructure as code.
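The same manual steps could also be scripted. The following is a minimal sketch, assuming boto3 is installed and the caller has IAM permissions on the CodeBuild role. The policy name `temp-cdk-lookup-assume-role` and the helper function names are hypothetical and not part of data.all; the policy document itself is the one shown above.

```python
import json

# Hypothetical name for the temporary inline policy; any valid name works.
TEMP_POLICY_NAME = "temp-cdk-lookup-assume-role"


def build_lookup_policy() -> dict:
    """Return the inline policy from the release notes as a Python dict."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": "sts:AssumeRole",
                "Resource": "arn:aws:iam::*:role/cdk-hnb659fds-lookup-role*",
            }
        ],
    }


def add_temp_policy(role_name: str) -> None:
    """Attach the inline policy to the role before retrying the pipeline."""
    import boto3  # imported lazily so build_lookup_policy works without boto3

    boto3.client("iam").put_role_policy(
        RoleName=role_name,
        PolicyName=TEMP_POLICY_NAME,
        PolicyDocument=json.dumps(build_lookup_policy()),
    )


def remove_temp_policy(role_name: str) -> None:
    """Remove the inline policy once the pipeline has run successfully."""
    import boto3

    boto3.client("iam").delete_role_policy(
        RoleName=role_name,
        PolicyName=TEMP_POLICY_NAME,
    )
```

Under these assumptions, you would call `add_temp_policy("<PREFIX>-<GITBRANCH>-codebuild-baseline-role")` with your real role name, retry the pipeline, and call `remove_temp_policy` with the same name after a successful run.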
Upgrading from <V1.6.0 to v1.6.1
The error points at a different role: one created by CDK, which looks like the following in the CodeBuild logs:
botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the AssumeRole operation: User: arn:aws:sts::111111111111:assumed-role/dataall-sbx8-cicd-stack-dataallsbx8cdkpipelinePipe-HMXY7D9OX4FM/AWSCodeBuild-30c50765-4529-4d20-99ce-88f82139a82c is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::22222222222:role/cdk-hnb659fds-lookup-role-22222222222-eu-west-1
Find that role and update it as explained in the "Upgrading from V1.6.0 to v1.6.1" section.
Once that is done, retry the CodeBuild Synth stage. In this case you do NOT need to clean up the manually added policy, because this role will be deleted.
Full Changelog: v1.6.0...v1.6.1
v1.6.0
What's Changed
New features
- Add share reason in share requests by @noah-paige in #498
- Import KMS key in imported datasets by @dlpzx in #515 and #572. Support for pre-existing imported datasets in #578
Security
- Fine-grained NACLs for backend VPC creation by @noah-paige in #543 and in #573
- Implement security response headers in Cloudfront distributions by @nikpodsh in #529
- Sanitize the string to avoid a connection string injection by @nikpodsh in #532
- Restrict KMS keys' policies by @noah-paige in #524
- Limit dataset IAM role permissions by @dlpzx in #497
- Limit environment IAM roles permissions by @dlpzx in #515
- Limit pivot role (IAM role) permissions by @dlpzx in #535 --> it will only be automatically applied to dataallPivotRole-cdk. Migrate to auto-created dataallPivotRole-cdk released in V1.4.0 or manually update the dataallPivotRole roles in your environments.
- Move parameters from Secrets Manager to SSM by @dlpzx in #455
- Disable profiling results from "secret" and "official" datasets by @dlpzx in #482
- CDK execution role policy template by @mourya-33 in #562
Bug-fixes
- Fix deletion of imported Glue database by @dlpzx in #512
- Remove unused resources and consolidate KMS keys in environment stack by @noah-paige in #524
- Fix urllib3 dependencies for glue profiling job by @noah-paige in #513
- Add cookiecutter config and environment variable for datapipelines stacks by @dbalintx in #582
- v1.6.0 backwards compatibility changes by @dlpzx in #567
- Add Glue Resource Policy Permissions for cross account share requests by @noah-paige in #579
⚠️ ⚠️ ⚠️ Important ⚠️ ⚠️ ⚠️
Breaking changes
- ⚠️ IMPORTANT: It is necessary to upgrade to version >V1.5.0 before upgrading to V1.6 to avoid deletion of resources due to the removal of custom resources.
- ⚠️ IMPORTANT: This release requires an update of environments and then datasets after upgrading, either by using the cdk.json parameter enable_update_dataall_stacks_in_cicd_pipeline, by waiting for the overnight update stack task, or by manually updating first environments and then datasets. If the environment stack is not updated, dataset creation and other functionalities will fail.
- ⚠️ IMPORTANT: Because of the implementation of #529 the CloudFront distribution will be recreated. This means that the URL used in the CloudFront distribution will be new. You can directly use the new URL. If you are using a custom domain with an SSL certificate, before upgrading to v1.6 you should remove the CNAMEs (for both frontend and userguide) from the old distributions as mentioned in #603.
- ⚠️ IMPORTANT: Additional EC2 permissions are needed in the CDK Synth CodeBuild because of the implementation of #543. This can be avoided by upgrading to v1.5.6 before upgrading to v1.6.0, or by manually adding the necessary permissions and retrying the pipeline run. Check the PR for more details.
- Developing locally requires using a role ending in -graphql-role, -awsworker-role or ecs-tasks-role to work with the more restrictive pivotRole trust policy implemented in #535.
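The role-name requirement for local development can be checked before running anything. The following is a rough sketch that only mirrors the three suffixes listed in the note. The helper name is hypothetical, and the actual trust policy condition implemented in #535 may be expressed differently, so check the PR for the authoritative rule.

```python
# Suffixes copied from the release note; the helper itself is hypothetical.
ALLOWED_ROLE_SUFFIXES = ("-graphql-role", "-awsworker-role", "ecs-tasks-role")


def is_allowed_local_dev_role(role_arn: str) -> bool:
    """Return True if the role name in the ARN ends with a listed suffix."""
    # The role name is the last path segment of an IAM role ARN.
    role_name = role_arn.rsplit("/", 1)[-1]
    return role_name.endswith(ALLOWED_ROLE_SUFFIXES)
```

For example, an ARN whose role name ends in `-graphql-role` passes the check, while an arbitrary admin role does not.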
New Contributors 🚀
- @mourya-33 made their first contribution in #562
Full Changelog: v1.5.6...v1.6.0
v1.5.6
What's Changed
Bug Fixes
- Resolve dataset share checks when deleting dataset by @noah-paige in #554
Enhancements
- Limiting read-only access to root file systems in ECS by @dbalintx in #523
- Optimized docker image size by @srinivasreddych in #549
- Update import dataset documentation by @marjet26 in #546
- Added ec2:DescribePrefix permissions to CDKSynth by @dlpzx in #566
Package updates
- Bump tough-cookie from 4.1.2 to 4.1.3 in /frontend by @dependabot in #558
- Bump semver from 5.7.1 to 5.7.2 in /frontend by @dependabot in #564
New Contributors
Welcome to the project 🎉
- @marjet26 made their first contribution in #546
- @srinivasreddych made their first contribution in #549
Full Changelog: v1.5.5...v1.5.6
v1.5.5
What's Changed
- hotfix: dynamic SQL generation by @chamcca in #514
- dependabot: upgrade fast-xml-parser, aws-amplify, react-scripts, override react-redux to non-vulnerable version by @dlpzx in #521
- dependabot: resolve nth-check in sub-dependencies by @dlpzx in #525
Full Changelog: v1.5.4...v1.5.5
v1.5.4
What's Changed
- Update CDK Version to v2.77.0 to fix vulnerability with CDK Pipeline role in CDK Pipelines construct by @gmuslia in #484
- Safe removal of consumption roles and teams with open share requests by @dlpzx in #485
- Fix typo that destroys storage locations by @dlpzx in #481
Full Changelog: v1.5.3...v1.5.4
v1.5.3
What's Changed
The following pull requests solve issues related to node12 being deprecated. This upgrade is necessary for a correct deployment of data.all.
- Updated CDK Version to fix issue with cdkproxy/ dataset stack creations by @gmuslia in #476
- update auth-at-edge semantic version to latest 2.1.5 by @dlpzx in #480
Full Changelog: v1.5.2...v1.5.3
v1.5.2
What's Changed
IMPORTANT: the following is a security fix. We encourage users of the data.all Pipelines feature with the GitHub templates option to upgrade.
Other fixes:
- fix: Fixes issue with existing cognito callbacks by @gmuslia in #464
- fix: Fix lambda/ECS IAM permissions for AOSS by @kukushking in #467
- fix: Upgrade aurora engine version to 11.16 by @kimengu-david in #471
- fix: 465 - Update Aurora default Parameter Group to 'default.aurora-postgresql11'. by @rbernotas in #466
- fix: Bump requests from 2.27.1 to 2.31.0 in /backend by @dependabot in #469
- fix: Bump requests from 2.27.1 to 2.31.0 in /backend/dataall/cdkproxy by @dependabot in https://github.com/awslabs/aws-dataall
- fix: Bump starlette from 0.25.0 to 0.27.0 and upgrade fastapi by @dlpzx in #460
New Contributors
- @rbernotas made their first contribution in #466
- @kimengu-david made their first contribution in #471
Full Changelog: v1.5.1...v1.5.2
v1.5.1
What's Changed
- Solve deployment bug #433 CloudFront logs does not enable ACL access by @akaitoua in #437
- Modify docker-compose yaml to read region and default region from env… by @dlpzx in #446
- Bump flask from 2.0.3 to 2.3.2 in /backend by @dependabot in #439
- Bump flask from 2.0.3 to 2.3.2 in /backend/dataall/cdkproxy by @dependabot in #438
- Bump pymdown-extensions from 8.1.1 to 10.0 in /documentation/userguide by @dependabot in #456
Full Changelog: v1.5.0...v1.5.1
v1.5.0
What's Changed
New features:
Check each PR for a complete description of the feature.
- Support OpenSearch Serverless by @kukushking in #292
- Include Pivot Role as part of environment stack (avoiding manual pivot role creation) by @dlpzx in #355
- Configurable restricted VPC for tooling resources by @dlpzx in #337
- Better handling of missing default VPCs and added VPC creation for SageMaker domains by @dlpzx in #427
Bug-Fixes:
- Fix dev Docker images base by @AmrSaber in #387
- Fix get AWS credentials from environment tab by @dlpzx in #391
- Added waiting conditions for slow creation of access points in sharing folders by @dlpzx in #392
- Fix shared dbs worksheet list (duplicates) by @noah-paige in #402
- Fix sharing update of storage location by @dlpzx in #404
- Backwards compatibility V1.5 fixes and documentation in #431
⚠️ ⚠️ ⚠️ Important ⚠️ ⚠️ ⚠️
Breaking changes
Both the environment and the dataset stacks have been updated in this release. We need to update environment stacks BEFORE creating new datasets or updating existing ones in the environment. There are 3 ways of updating your stacks:
- Automatically (daily task) - There is a scheduled ECS task that updates stacks daily. It has been modified to update environments and then datasets. Until the task is executed, environments and datasets won't reflect the latest status of the code and creation of new datasets will fail.
- Automatically (add CICD stage) - We have introduced an optional CICD stage that triggers the ECS stack-updater task from the CICD pipeline. You need to set enable_update_dataall_stacks_in_cicd_pipeline to true in the cdk.json file to enable this stage (check #355 for more details). In this case the only downtime will be the time in which the CICD pipeline is running.
- Manually - In the data.all console, go to the environment window > Stack tab > click on Update. Once it has completed, go to the required dataset window > Stack tab > click on Update.
Migrating to OpenSearch serverless
If you have deployed data.all with Amazon OpenSearch and would like to migrate to Amazon OpenSearch Serverless,
you would need to migrate the indexes to your new cluster. Although data.all currently does not provide an automated
migration tool, it is possible to do so manually using the following approaches:
- Migrate your indexes to Amazon OpenSearch Serverless with Logstash.
- Migrating Amazon OpenSearch Service indexes using remote reindex
Migrating from manual pivot roles to automatically created pivot roles
If you already have environments that use a manually created pivot role and want to switch to pivot roles created automatically as part of the environment stack, you just need to add the enable_pivot_role_auto_create parameter to cdk.json and set it to true. While the CICD pipeline is upgrading you will experience downtime, because the backend, the frontend, and the environment and dataset stacks are updated in different CodeBuild stages. For this upgrade we recommend setting enable_update_dataall_stacks_in_cicd_pipeline to update the environment and dataset stacks; otherwise you can either wait for the daily task or manually update all stacks as explained above.
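The two cdk.json parameters mentioned in this release can be set by hand or with a small script. The following is a sketch only: it assumes the flags belong inside each entry of a `DeploymentEnvironments` list under `context` in cdk.json, which is an assumption about the file layout and is not stated in these notes (check #355 and your own cdk.json). The function names are hypothetical.

```python
import json
from pathlib import Path

# Parameter names taken from the release notes.
FLAGS = (
    "enable_pivot_role_auto_create",
    "enable_update_dataall_stacks_in_cicd_pipeline",
)


def enable_flags(cdk_config: dict, flags=FLAGS) -> dict:
    """Set each flag to True in every deployment environment entry.

    ASSUMPTION: the flags live in context -> DeploymentEnvironments,
    a list of dicts. Adjust the lookup if your cdk.json differs.
    """
    environments = cdk_config.get("context", {}).get("DeploymentEnvironments", [])
    for env in environments:
        for flag in flags:
            env[flag] = True
    return cdk_config


def update_cdk_json(path: str = "cdk.json") -> None:
    """Read cdk.json, enable the flags, and write the file back."""
    file = Path(path)
    config = json.loads(file.read_text())
    file.write_text(json.dumps(enable_flags(config), indent=2) + "\n")
```

Rewriting the file with `json.dumps` normalizes its indentation, so review the diff before committing the change that triggers the CICD pipeline.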
Special thanks to @kukushking, @nikpodsh, @noah-paige, @AmrSaber for their contributions!
Full Changelog: v1.4.3...v1.5.0
v1.4.3
What's Changed
- Pin alembic version to 'alembic==1.9.4' by @dlpzx in #354
- Bugfix default value cascade:false by @dlpzx in #363
- Bump webpack from 5.75.0 to 5.76.1 in /frontend by @dependabot in #371
- Upgrade sqlalchemy 13.16 -> 1.3.24 and starlette 0.19.1 -> 0.25.0, ariadne 0.13 -> 0.17, fastapi 0.78 -> 0.92 by @dlpzx in #379
- BUGFIX - Add dependency in dataset stack by @dlpzx in #385
Full Changelog: v1.4.2...v1.4.3