Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Additional Steps for BQ_Source #1472

Open
wants to merge 11 commits into
base: develop
Choose a base branch
from

Conversation

Praveena2607
Copy link

@Praveena2607 Praveena2607 commented Dec 11, 2024

Please verify the Scenarios.

Copy link

google-cla bot commented Dec 11, 2024

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@itsankit-google itsankit-google added the build Trigger unit test build label Dec 11, 2024
Copy link
Contributor

@masivesky masivesky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the PR description to reflect details and context about the changes. For eg: #1444 (comment)

Copy link
Contributor

@masivesky masivesky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The BUTs seems to be failing, please take a look and fix the errors.

@Praveena2607
Copy link
Author

Description:

Updated the BigQuery plugin to include additional steps for better handling.
Modified property locators in Cdfplugin property locators.java to support new BigQuery functionality.
Updated the errorMessage.properties and pluginParameters.properties file to ensure proper error handling and parameter management for the new configuration.

Modified Files:

Cdfplugin property locators.java
errorMessage.properties
pluginParameters.properties

File Modified:

src/e2e-test/java/io/cdap/plugin/utils/CdfPluginPropertyLocator.java

Unit Tests:

No changes made to unit tests.
Verified that all new test cases pass successfully.

@masivesky
Copy link
Contributor

Build with unit tests is still failing, there are a bunch of checkstyle errors, for eg:

/tmp/google-cloud/google-cloud/src/e2e-test/java/io/cdap/plugin/utils/E2ETestConstants.java:20:68: '=' is not preceded with whitespace. [WhitespaceAround]

Please check and fix the same

pom.xml Outdated
@@ -86,7 +86,7 @@
<google.cloud.storage.version>2.3.0</google.cloud.storage.version>
<google.cloud.datastore.version>1.105.1</google.cloud.datastore.version>
<google.protobuf.java.version>3.19.4</google.protobuf.java.version>
<google.tink.version>1.3.0-rc3</google.tink.version>
<google.tink.version>1.5.0</google.tink.version>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this change?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I encountered dependency errors while running mvn clean install after resolving all the checkstyle issues. The build was failing due to the outdated version of google.tink.version. Updating it to 1.5.0 resolved the issue, and no other changes were made.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please do not make dependency changes in the e2e tests PR.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactor CdfPluginPropertyLocator alignment and update build dependencies.
-Fixed indentation alignment for partitionFrom, partitionTo, filter and Output Schema.
-Updated pom.xml as it is.

Comment on lines 54 to 86
.put("projectId", CdfPluginPropertyLocator.PROJECT_ID)
.put("datasetProjectId", CdfPluginPropertyLocator.DATASET_PROJECT_ID)
.put("dataset", CdfPluginPropertyLocator.DATASET)
.put("table", CdfPluginPropertyLocator.TABLE)
.put("format", CdfPluginPropertyLocator.FORMAT)
.put("path", CdfPluginPropertyLocator.PATH)
.put("sampleSize", CdfPluginPropertyLocator.SAMPLE_SIZE)
.put("delimiter", CdfPluginPropertyLocator.DELIMITER)
.put("skipHeader", CdfPluginPropertyLocator.SKIP_HEADER)
.put("pathSuffix", CdfPluginPropertyLocator.SUFFIX)
.put("encryptionKeyName", CdfPluginPropertyLocator.CMEK_KEY)
.put("serviceAccountType", CdfPluginPropertyLocator.SERVICE_ACCOUNT_TYPE)
.put("serviceAccountFilePath", CdfPluginPropertyLocator.SERVICE_ACCOUNT_PATH)
.put("serviceAccountJSON", CdfPluginPropertyLocator.SERVICE_ACCOUNT_JSON)
.put("truncateTable", CdfPluginPropertyLocator.TRUNCATE_TABLE)
.put("updateTableSchema", CdfPluginPropertyLocator.UPDATE_TABLE_SCHEMA)
.put("topic", CdfPluginPropertyLocator.PUBSUB_TOPIC)
.put("maximumBatchCount", CdfPluginPropertyLocator.PUBSUB_MAXIMUM_BATCH_COUNT)
.put("maximumBatchSize", CdfPluginPropertyLocator.PUBSUB_MAXIMUM_BATCH_SIZE)
.put("publishDelayThreshold", CdfPluginPropertyLocator.PUBSUB_PUBLISH_DELAY_THRESHOLD)
.put("retryTimeout", CdfPluginPropertyLocator.PUBSUB_RETRY_TIMEOUT)
.put("errorThreshold", CdfPluginPropertyLocator.PUBSUB_ERROR_THRESHOLD)
.put("outputSchema", CdfPluginPropertyLocator.OUTPUT_SCHEMA_MACRO_INPUT)
.put("objectsToDelete", CdfPluginPropertyLocator.GCS_DELETE_OBJECTS_TO_DELETE)
.put("objectsToCreate", CdfPluginPropertyLocator.GCS_CREATE_OBJECTS_TO_CREATE)
.put("createFailIfObjectExists", CdfPluginPropertyLocator.GCS_CREATE_FAIL_IF_OBJECT_EXISTS)
.put("gcsMoveSourcePath", CdfPluginPropertyLocator.GCS_MOVE_SOURCE_PATH)
.put("gcsMoveDestinationPath", CdfPluginPropertyLocator.GCS_MOVE_DESTINATION_PATH)
.put("partitionFrom", CdfPluginPropertyLocator.PARTITION_START_DATE)
.put("partitionTo", CdfPluginPropertyLocator.PARTITION_END_DATE)
.put("filter", CdfPluginPropertyLocator.FILTER)
.put("Output Schema-macro-input", CdfPluginPropertyLocator.OUTPUT_SCHEMA)
.build();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this change?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was unable to add my steps properly in this field because the alignments were incorrect. To fix this, I copied and pasted the same content and added my steps without making any other changes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please fix the code style in your editor.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactor CdfPluginPropertyLocator alignment and update build dependencies.
-Fixed indentation alignment for partitionFrom, partitionTo, filter and Output Schema.
-Updated pom.xml as it is.

Then Replace input plugin property: "table" with value: "bqSourceTable"
Then Click on the Get Schema button
Then Enter BigQuery source properties partitionFrom and partitionTo
Then Validate BigQuery source incorrect property error for Partition Start date "<property>" value "<value>"
Copy link
Contributor

@itsmekumari itsmekumari Dec 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Praveena2607 I don't think the internal review comments have been addressed. Please address all the internal review comments provided earlier in this PR as well. Ref Praveena2607#1

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed over call, Please resolve all the discussed points and mentioned in internal PR. To use the latest framework steps in all scenarios, for entering the values in property, for validating inline error messages, for selecting output schema macro.

…cies.

-Fixed indentation alignment for partitionFrom, partitionTo, filter and Output Schema.
-Updated pom.xml for as it is.
Then Validate the cmek key "cmekGCS" of target GCS bucket if cmek is enabled

@CMEK @BQ_SOURCE_TEST @GCS_SINK_TEST
Scenario:Validate successful records transfer from BigQuery to GCS with macro arguments for filter and outputschema
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Output schema macro step is not present in scenario but in title it is written. Make sure if not using in scenario remove from title, and add in the scenario where step is used.

Scenario:Validate successful records transfer from BigQuery to GCS with macro arguments for filter and outputschema
Given Open Datafusion Project to configure pipeline
When Source is BigQuery
When Sink is GCS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the latest steps from framework.

Then Enter BigQuery property "serviceAccountJSON" as macro argument "serviceAccount"
Then Enter BigQuery property "dataset" as macro argument "bqDataset"
Then Enter BigQuery property "table" as macro argument "bqSourceTable"
Then Enter BigQuery source property output schema "outputSchema" as macro argument "bqOutputSchema"
Copy link
Contributor

@itsmekumari itsmekumari Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the existing step discussed earlier, for selecting macro action for output schema.

@@ -260,4 +264,124 @@ public void validateRecordsTransferredToTargetTableIsEqualToNumberOfRecordsFromS
BeforeActions.scenario.write("Number of records transferred from source table to target table:" + count);
Assert.assertEquals(count, countRecordsTarget);
}

@Then("Enter BigQuery source properties partitionFrom and partitionTo")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to add the step definition. Please make use of existing steps. This was discussed may times in our calls.

CdfBigQueryPropertiesActions.enterPartitionEndDate(new SimpleDateFormat("dd-MM-yyyy")
.format(DateUtils.addDays(new Date(), 1)));
}
@Then("Validate BigQuery source incorrect property error for Partition Start date {string} value {string}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to add the step definition. Please make use of existing steps.

Assert.assertEquals(expectedColor, actualColor);
}

@Then("Validate BigQuery source incorrect property error for Partition End date {string} value {string}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here as well.

Assert.assertEquals(expectedColor, actualColor);
}

@Then("Validate BigQuery source incorrect property error for reference name{string} value {string}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here as well.

pom.xml Outdated
@@ -86,7 +86,7 @@
<google.cloud.storage.version>2.3.0</google.cloud.storage.version>
<google.cloud.datastore.version>1.105.1</google.cloud.datastore.version>
<google.protobuf.java.version>3.19.4</google.protobuf.java.version>
<google.tink.version>1.3.0-rc3</google.tink.version>
<google.tink.version>1.3.0</google.tink.version>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is still not reverted to older version

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now, It is reverted and please review.

Copy link
Member

@itsankit-google itsankit-google left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please revert back the checkstyle changes in existing steps.

Comment on lines 19 to 43
import io.cdap.e2e.pages.actions.CdfPluginPropertiesActions;
import io.cdap.e2e.pages.actions.CdfStudioActions;
import io.cdap.e2e.pages.locators.CdfBigQueryPropertiesLocators;
import io.cdap.e2e.pages.locators.CdfStudioLocators;
import io.cdap.e2e.utils.BigQueryClient;
import io.cdap.e2e.utils.ConstantsUtil;
import io.cdap.e2e.utils.ElementHelper;
import io.cdap.e2e.utils.PluginPropertyUtils;
import io.cdap.e2e.utils.SeleniumHelper;
import io.cdap.e2e.utils.WaitHelper;
import io.cdap.plugin.common.stepsdesign.TestSetupHooks;
import io.cdap.plugin.utils.CdfPluginPropertyLocator;
import io.cdap.plugin.utils.E2EHelper;
import io.cdap.plugin.utils.E2ETestConstants;
import io.cucumber.java.en.Then;
import org.apache.commons.lang3.StringUtils;
import org.apache.commons.lang3.time.DateUtils;
import org.junit.Assert;
import stepsdesign.BeforeActions;

import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Optional;
import java.util.Properties;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason we are adding additional imports here? Don't see them being used in the file anywhere

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for pointing this out. I did not manually add any of these imports. This branch was created by cloning from the master, and it seems these additional imports were automatically included during the branching process. could you please suggest how we should proceed?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have now removed the unnecessary imports that are not in use.

@@ -17,4 +17,8 @@ public class E2ETestConstants {
public static final String ERROR_MSG_BQ_INCORRECT_CHUNKSIZE = "errorMessageIncorrectBQChunkSize";
public static final String ERROR_MSG_BQ_INCORRECT_TEMPORARY_BUCKET = "errorMessageIncorrectBQBucketName";
public static final String ERROR_MSG_BQ_INCORRECT_PROPERTY = "errorMessageIncorrectBQProperty";
public static final String ERROR_MSG_INCORRECT_PARTITIONSTARTDATE = "errorMessageIncorrectPartitionStartDate";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check this change is not required.

Copy link
Contributor

@itsmekumari itsmekumari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Address all the review comments ASAP. And check the build is in green.

… have been committed. The build has been successfully tested locally.
… have been committed. The build has been successfully tested locally.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build Trigger unit test build
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants