E2E GCS Sink additional test scenarios. #1478
base: develop
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up-to-date status, view the checks section at the bottom of the pull request.
@@ -265,3 +265,175 @@ Feature: GCS sink - Verification of GCS Sink plugin
Then Open and capture logs
Then Verify the pipeline status is "Succeeded"
Then Verify data is transferred to target GCS bucket

#Added new scenarios for GCS Sink - Bijay
Remove the commented line here.
Done.

#Added new scenarios for GCS Sink - Bijay
@BQ_SOURCE_TEST @GCS_SINK_TEST
Scenario:Validate successful records transfer from BigQuery to GCS with macro enabled at sink
Add the macro scenario in a separate feature file named for macros; refer to the other plugins' feature files for the naming convention.
Done.
Then Verify data is transferred to target GCS bucket

@GCS_SINK_TEST @BQ_SOURCE_TEST
Scenario Outline: To verify data is getting transferred successfully from BigQuery to GCS with contenttype selection
Add the validation for the file format as well.
In progress.
| tsv | text/plain |

@BQ_SOURCE_TEST @GCS_SINK_TEST
Scenario: To verify data is getting transferred successfully from BigQuery to GCS using advanced file system properties field
Why are we adding a macro here again? It is already covered in the macro-enabled scenario. This scenario should run without macros enabled.
Done.
@BQ_SOURCE_TEST @GCS_SINK_TEST
Scenario: To verify data is getting transferred successfully from BigQuery to GCS using advanced file system properties field
Given Open Datafusion Project to configure pipeline
When Source is BigQuery
Use the latest existing steps. This is a common review comment across all scenarios.
Then Close the GCS properties
Then Save the pipeline
Then Preview and run the pipeline
Then Enter runtime argument value "gcsFileSysProperty" for key "FileSystemPr"
I don't see any value added in the parameter file for the file system property.
Done.
Then Close the preview
Then Deploy the pipeline
Then Run the Pipeline in Runtime
Then Enter runtime argument value "projectId" for key "gcsProjectId"
Remove the properties from the macro that are already covered in the scenarios; for example, projectID is already covered.
Done.
@@ -33,4 +33,5 @@ errorMessageMultipleFileWithoutClearDefaultSchema=Found a row with 4 fields when
errorMessageInvalidSourcePath=Invalid bucket name in path 'abc@'. Bucket name should
errorMessageInvalidDestPath=Invalid bucket name in path 'abc@'. Bucket name should
errorMessageInvalidEncryptionKey=CryptoKeyName.parse: formattedString not in valid format: Parameter "abc@" must be
errorMessageInvalidBucketNameSink=Spark program 'phase-1' failed with error: Errors were encountered during validation. Error code: 400, Unable to read or access GCS bucket. Bucket names must be at least 3 characters in length, got 2: 'gg'. Please check the system logs for more details.
Add only the relevant error message.
Done.
Then Verify data is transferred to target GCS bucket

@GCS_SINK_TEST @BQ_SOURCE_TEST @GCS_Sink_Required
Scenario Outline: To verify successful data transfer from BigQuery to GCS for different formats with write header true
This scenario should be from GCS source to GCS sink, right? Re-check and change accordingly. Also, why are we making it a macro scenario? That is already covered in the macro-enabled scenario anyway.
Done.
@GCS_SINK_TEST @BQ_SOURCE_TEST @GCS_Sink_Required
Scenario Outline: To verify successful data transfer from BigQuery to GCS for different formats with write header true
Given Open Datafusion Project to configure pipeline
When Source is BigQuery
Use the latest existing steps from the framework. Change this in all the scenarios.
Done.
Then Wait till pipeline is in running state
Then Open and capture logs
Then Verify the pipeline status is "Succeeded"
Then Verify data is transferred to target GCS bucket
Add the validation steps in all the scenarios.
@@ -65,3 +65,37 @@ Feature: GCS sink - Verify GCS Sink plugin error scenarios
Then Select GCS property format "csv"
Then Click on the Validate button
Then Verify that the Plugin Property: "format" is displaying an in-line error message: "errorMessageInvalidFormat"

@GCS_SINK_TEST @BQ_SOURCE_TEST
Change the tag order for ease of understanding.
Done.
Then Enter runtime argument value "gcsInvalidBucketNameSink" for key "gcsSinkPath"
Then Run the Pipeline in Runtime with runtime arguments
Then Wait till pipeline is in running state
Then Verify the pipeline status is "Failed"
Add the "Open and capture logs" step.
Done.
Then Verify data is transferred to target GCS bucket
Then Validate the values of records transferred to GCS bucket is equal to the values from source BigQuery table

@GCS_CSV @GCS_SINK_TEST @GCS_Source_Required @ITN_TEST
Remove the ITN_TEST tag.
Done.
When Select plugin: "GCS" from the plugins list as: "Sink"
Then Connect plugins: "GCS" and "GCS2" to establish connection
Then Navigate to the properties page of plugin: "GCS"
Then Select dropdown plugin property: "select-schema-actions-dropdown" with option value: "clear"
Why are we using this step?
This step just clears the output schema.
I mean, why are we adding this step? It is not required, right?
Then Open and capture logs
Then Verify the pipeline status is "Succeeded"
Then Verify data is transferred to target GCS bucket
Then Validate the cmek key "cmekGCS" of target GCS bucket if cmek is enabled
Add the validation step for validating the values.
In progress.
Done.
| csv | text/csv |
| tsv | text/plain |

Remove the extra line here.
Done.
Examples:
| FileFormat |
| csv |
#| tsv |
Why are these commented out? Uncomment tsv and delimited.
Fixed.
caf0a54 to 9c57d24
Then Validate the values of records transferred to GCS bucket is equal to the values from source BigQuery table

@GCS_AVRO_FILE @GCS_SINK_TEST @GCS_Source_Required
Scenario Outline: To verify data transferred successfully from GCS Source to GCS Sink with write header true at Sink
This scenario can be merged with "Validate successful records transfer from BigQuery to GCS with advanced file system properties field" and "To verify data is getting transferred successfully from BigQuery to GCS with contenttype selection". Why do we need a separate scenario for these?
Done.
Then Verify data is transferred to target GCS bucket
Then Validate the values of records transferred to GCS bucket is equal to the values from source BigQuery table

@GCS_AVRO_FILE @GCS_SINK_TEST @GCS_Source_Required
Why is the GCS_Source_Required tag here?
Removed.
c4e93b9 to e8d2829
0bbe285 to 7d5f5fa
…EgcsNewChangesSink_BT # Conflicts: # src/e2e-test/features/gcs/sink/GCSSink.feature
No description provided.