Releases: data-dot-all/dataall

v1.6.1

25 Jul 10:15
f3baf14

What's Changed

⚠️ We strongly recommend upgrading directly to V1.6.2 and skipping this release. V1.6.2 includes a better implementation of the V1.6.1 fixes. ⚠️

  • Fix wrong update of externalId for pivotRole by @dlpzx in #591

Manual actions required

ONLY if you are upgrading!
On the first run, the CodePipeline will fail in the CDK Synth stage unless the additional changes described below are made. The failure looks like this:

botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the AssumeRole operation: User: arn:aws:sts::111111111111:assumed-role/SOME ROLE/... is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::222222222222:role/cdk-hnb659fds-lookup-role-22222222222-eu-west-1

CodeBuild needs additional permissions to assume the IAM role in the CDK Synth stage. Since we cannot update this CodeBuild stage without running it, the permissions need to be added manually.

Upgrading from V1.6.0 to v1.6.1

The role to update is named <PREFIX>-<GITBRANCH>-codebuild-baseline-role. Its name appears in the error message in the CodeBuild logs.
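To avoid reading the role name off the log by eye, it can be extracted from the error text. The following is a minimal sketch, assuming the message has the same shape as the AccessDenied error shown above; the helper name `role_from_access_denied` is hypothetical and not part of data.all.

```python
import re
from typing import Optional

# Matches the "User:" ARN of an assumed role and captures the role name,
# i.e. the segment between "assumed-role/" and the session name.
_ASSUMED_ROLE = re.compile(r"User: arn:aws:sts::\d+:assumed-role/([^/]+)/")


def role_from_access_denied(message: str) -> Optional[str]:
    """Return the role name from an AccessDenied message, or None if absent."""
    match = _ASSUMED_ROLE.search(message)
    return match.group(1) if match else None
```

Passing the full log line to `role_from_access_denied` should return the name of the role whose permissions need to be extended.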

  1. Go to the IAM role (<PREFIX>-<GITBRANCH>-codebuild-baseline-role) and click on Add permissions > Create inline policy
  2. Update the policy: switch to the JSON editor and paste the policy below.

The policy of the CodeBuild execution role needs to include the following:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": "arn:aws:iam::*:role/cdk-hnb659fds-lookup-role*"
        }
    ]
}
  3. After the pipeline has run successfully, go back to the IAM role and remove the manually added policy. The policy is now added as part of the infrastructure as code.
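The console steps above could also be scripted. This is a rough sketch, not an official data.all procedure: it assumes you pass in a boto3 IAM client (for example `boto3.client("iam")`) with permission to edit the role, and the policy name `temp-cdk-lookup-assume-role` is a hypothetical placeholder. `put_role_policy` and `delete_role_policy` are the standard IAM client calls for inline policies.

```python
import json

# Hypothetical name for the temporary inline policy; choose your own.
POLICY_NAME = "temp-cdk-lookup-assume-role"

# Same permissions as the JSON policy shown above.
LOOKUP_ROLE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": "arn:aws:iam::*:role/cdk-hnb659fds-lookup-role*",
        }
    ],
}


def add_lookup_policy(iam, role_name: str) -> None:
    """Attach the temporary inline policy to the CodeBuild role."""
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName=POLICY_NAME,
        PolicyDocument=json.dumps(LOOKUP_ROLE_POLICY),
    )


def remove_lookup_policy(iam, role_name: str) -> None:
    """Remove the temporary inline policy once the pipeline has succeeded."""
    iam.delete_role_policy(RoleName=role_name, PolicyName=POLICY_NAME)
```

You would call `add_lookup_policy` with the role name from the error message before retrying the pipeline, and `remove_lookup_policy` after it completes.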

Upgrading from <V1.6.0 to v1.6.1

The error points at a different role: one created by CDK, which looks like the following in the CodeBuild logs:

botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the AssumeRole operation: User: arn:aws:sts::111111111111:assumed-role/dataall-sbx8-cicd-stack-dataallsbx8cdkpipelinePipe-HMXY7D9OX4FM/AWSCodeBuild-30c50765-4529-4d20-99ce-88f82139a82c is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::22222222222:role/cdk-hnb659fds-lookup-role-22222222222-eu-west-1

Find this role and update it as explained in the "Upgrading from V1.6.0 to v1.6.1" section.

Once that is done, retry the CodeBuild Synth stage. In this case you do NOT need to clean up the manually added policy, as this role will be deleted.

Full Changelog: v1.6.0...v1.6.1

v1.6.0

19 Jul 13:27
84c555e

⚠️ Read the IMPORTANT section before upgrading ⚠️
⚠️ We strongly recommend upgrading directly to V1.6.2 ⚠️

What's Changed

New features

  • Add share reason in share requests by @noah-paige in #498
  • Import KMS key in imported datasets by @dlpzx in #515 and #572. Support for pre-existing imported datasets in #578

Security

  • Fine-grained NACLs for backend VPC creation by @noah-paige in #543 and in #573
  • Implement security response headers in Cloudfront distributions by @nikpodsh in #529
  • Sanitize the string to avoid a connection string injection by @nikpodsh in #532
  • Restrict KMS keys' policies by @noah-paige in #524
  • Limit dataset IAM role permissions by @dlpzx in #497
  • Limit environment IAM roles permissions by @dlpzx in #515
  • Limit pivot role (IAM role) permissions by @dlpzx in #535. This is only applied automatically to dataallPivotRole-cdk. Either migrate to the auto-created dataallPivotRole-cdk released in V1.5.0 (#355), or manually update the dataallPivotRole roles in your environments.
  • Move parameters from Secrets Manager to SSM by @dlpzx in #455
  • Disable profiling results from "secret" and "official" datasets by @dlpzx in #482
  • CDK execution role policy template by @mourya-33 in #562

Bug-fixes

  • Fix deletion of imported Glue database by @dlpzx in #512
  • Removed unused resources and consolidate KMS keys in environment stack by @noah-paige in #524
  • Fix urllib3 dependencies for glue profiling job by @noah-paige in #513
  • Add cookiecutter config and environment variable for datapipelines stacks by @dbalintx in #582
  • v1.6.0 backwards compatibility changes by @dlpzx in #567
  • Add Glue Resource Policy Permissions for cross account share requests by @noah-paige in #579

⚠️ ⚠️ ⚠️ Important ⚠️ ⚠️ ⚠️

Breaking changes

  • ⚠️ IMPORTANT: It is necessary to upgrade to version >V1.5.0 before upgrading to V1.6 to avoid deletion of resources due to the removal of custom resources.
  • ⚠️ IMPORTANT: This release requires updating environments and then datasets after upgrading. You can do this by setting the cdk.json parameter enable_update_dataall_stacks_in_cicd_pipeline, by waiting for the overnight stack-update task, or by manually updating first the environments and then the datasets. If the environment stack is not updated, dataset creation and other functionality will fail.
  • ⚠️ IMPORTANT: Because of the implementation of #529, the CloudFront distribution will be recreated, so its URL will change. You can use the new URL directly. If you are using a custom domain with an SSL certificate, remove the CNAMEs (for both frontend and userguide) from the old distributions before upgrading to v1.6, as mentioned in #603.
  • ⚠️ IMPORTANT: Additional EC2 permissions are needed in the CDK Synth CodeBuild stage because of the implementation of #543. This can be avoided by upgrading to v1.5.6 before upgrading to v1.6.0, or by manually adding the necessary permissions and retrying the pipeline run. Check the PR for more details.
  • Developing locally requires using a role ending in -graphql-role, -awsworker-role or ecs-tasks-role to work with the more restrictive pivotRole trust policy implemented in #535.
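As a quick local sanity check of the last point, the role name can be compared against the suffixes listed above. This is only a sketch of the naming rule as stated in these notes; it does not evaluate the actual pivotRole trust policy from #535, and the function name is hypothetical.

```python
# Suffixes copied from the release note above.
ALLOWED_SUFFIXES = ("-graphql-role", "-awsworker-role", "ecs-tasks-role")


def is_allowed_local_role(role_name: str) -> bool:
    """True if the role name ends in one of the suffixes listed in the note."""
    return role_name.endswith(ALLOWED_SUFFIXES)
```

A role that fails this check would not be expected to be able to assume the pivot role when developing locally.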

New Contributors 🚀

Full Changelog: v1.5.6...v1.6.0

v1.5.6

12 Jul 10:40
45c5cfb

What's Changed

Bug Fixes

  • Resolve dataset share checks when deleting dataset by @noah-paige in #554

Enhancements

Package updates

New Contributors

Welcome to the project 🎉

Full Changelog: v1.5.5...v1.5.6

v1.5.5

20 Jun 11:25
aa9d3df

What's Changed

  • hotfix: dynamic SQL generation by @chamcca in #514
  • dependabot: upgrade fast-xml-parser, aws-amplify, react-scripts, override react-redux to non-vulnerable version by @dlpzx in #521
  • dependabot: resolve nth-check in sub-dependencies by @dlpzx in #525

New Contributors

Full Changelog: v1.5.4...v1.5.5

v1.5.4

06 Jun 11:41
fa45abd

What's Changed

  • Update CDK Version to v2.77.0 to fix vulnerability with CDK Pipeline role in CDK Pipelines construct by @gmuslia in #484
  • Safe removal of consumption roles and teams with open share requests by @dlpzx in #485
  • Fix typo that destroys storage locations by @dlpzx in #481

Full Changelog: v1.5.3...v1.5.4

v1.5.3

25 May 11:20
d0ea832

What's Changed

The following pull requests solve issues related to node12 being deprecated. This upgrade is necessary for a correct deployment of data.all.

  • Updated CDK Version to fix issue with cdkproxy/ dataset stack creations by @gmuslia in #476
  • update auth-at-edge semantic version to latest 2.1.5 by @dlpzx in #480

Full Changelog: v1.5.2...v1.5.3

v1.5.2

24 May 13:39
3340610

What's Changed

IMPORTANT: the following is a security fix. We encourage anyone using the data.all Pipelines feature with the GitHub templates option to upgrade.

  • hotfix: Remove GitHub template option from data.all Pipelines by @dlpzx in #472

Other fixes:

New Contributors

Full Changelog: v1.5.1...v1.5.2

v1.5.1

16 May 13:49
e9ebb08

What's Changed

  • Solve deployment bug #433 CloudFront logs does not enable ACL access by @akaitoua in #437
  • Modify docker-compose yaml to read region and default region from env… by @dlpzx in #446
  • Bump flask from 2.0.3 to 2.3.2 in /backend by @dependabot in #439
  • Bump flask from 2.0.3 to 2.3.2 in /backend/dataall/cdkproxy by @dependabot in #438
  • Bump pymdown-extensions from 8.1.1 to 10.0 in /documentation/userguide by @dependabot in #456

New Contributors

Full Changelog: v1.5.0...v1.5.1

v1.5.0

25 Apr 12:15
219553f

What's Changed

New features:

Check each PR for a complete description of the feature.

  • Support OpenSearch Serverless by @kukushking in #292
  • Include Pivot Role as part of environment stack (avoiding manual pivot role creation) by @dlpzx in #355
  • Configurable restricted VPC for tooling resources by @dlpzx in #337
  • Better handling of missing default VPCs and added VPC creation for SageMaker domains by @dlpzx in #427

Bug-Fixes:

  • Fix dev Docker images base by @AmrSaber in #387
  • Fix get AWS credentials from environment tab by @dlpzx in #391
  • Added waiting conditions for slow creation of access points in sharing folders by @dlpzx in #392
  • Fix shared dbs worksheet list (duplicates) by @noah-paige in #402
  • Fix sharing update of storage location by @dlpzx in #404
  • Backwards compatibility V1.5 fixes and documentation #431

⚠️ ⚠️ ⚠️ Important ⚠️ ⚠️ ⚠️

Breaking changes

Both the environment and the dataset stacks have been updated in this release. We need to update environment stacks BEFORE creating new datasets or updating existing ones in the environment. There are 3 ways of updating your stacks:

  1. Automatically (daily task) - There is a scheduled ECS task that updates stacks daily. It has been modified to update environments and then datasets. Until the task has run, environments and datasets won't reflect the latest state of the code and creation of new datasets will fail.
  2. Automatically (add CICD stage) - We have introduced an optional CICD stage that triggers the ECS stack-updater task from the CICD pipeline. You need to set enable_update_dataall_stacks_in_cicd_pipeline to true in the cdk.json file to enable this stage (check #355 for more details). In this case the only downtime will be the time in which the CICD pipeline is running.
  3. Manually - In the data.all console, go to the environment window > Stack tab > click on Update. Once it has completed, go to the required dataset window > Stack tab > click on Update.
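For the second option, the flag can be set by editing cdk.json by hand or with a small script. The sketch below assumes the flag lives on each entry of a `context.DeploymentEnvironments` list in cdk.json; that layout is an assumption here, so check #355 and your own cdk.json before relying on it. Both function names are hypothetical.

```python
import json
from pathlib import Path


def set_deployment_flag(cdk_json: dict, flag: str, value: bool = True) -> dict:
    """Set `flag` on every deployment environment entry (assumed layout)."""
    environments = cdk_json.get("context", {}).get("DeploymentEnvironments", [])
    for env in environments:
        env[flag] = value
    return cdk_json


def enable_in_file(path: str, flag: str) -> None:
    """Load cdk.json from `path`, set the flag, and write the file back."""
    cdk_path = Path(path)
    config = json.loads(cdk_path.read_text())
    set_deployment_flag(config, flag)
    cdk_path.write_text(json.dumps(config, indent=4) + "\n")
```

For example, `enable_in_file("cdk.json", "enable_update_dataall_stacks_in_cicd_pipeline")` would switch the stage on for every configured deployment environment, under the layout assumption above.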

Migrating to OpenSearch serverless

If you have deployed data.all with Amazon OpenSearch and would like to migrate to Amazon OpenSearch Serverless,
you would need to migrate the indexes to your new cluster. Although data.all currently does not provide an automated
migration tool, it is possible to do so manually using the following approaches:

Migrating from manual pivot roles to automatically created pivot roles

If you already have environments that use a manually created pivot role and want to switch to pivot roles created automatically as part of the environment stack, you just need to add the enable_pivot_role_auto_create parameter to cdk.json and set it to true. While the CICD pipeline is upgrading you will experience downtime, because the backend, the frontend, and the environment and dataset stacks are updated in different CodeBuild stages. For this upgrade we recommend setting enable_update_dataall_stacks_in_cicd_pipeline to update the environment and dataset stacks; otherwise you can either wait for the daily task or manually update all stacks as explained above.

Special thanks to @kukushking, @nikpodsh, @noah-paige, and @AmrSaber for their contributions!

Full Changelog: v1.4.3...v1.5.0

v1.4.3

28 Mar 14:16
79f0e4c

What's Changed

  • Pin alembic version to 'alembic==1.9.4' by @dlpzx in #354
  • Bugfix default value cascade:false by @dlpzx in #363
  • Bump webpack from 5.75.0 to 5.76.1 in /frontend by @dependabot in #371
  • Upgrade sqlalchemy 13.16 -> 1.3.24 and starlette 0.19.1 -> 0.25.0, ariadne 0.13 -> 0.17, fastapi 0.78 -> 0.92 by @dlpzx in #379
  • BUGFIX - Add dependency in dataset stack by @dlpzx in #385

Full Changelog: v1.4.2...v1.4.3