Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(source): deny unknown fields for WITH options #13654

Merged
merged 3 commits into from
Dec 27, 2023
Merged

feat(source): deny unknown fields for WITH options #13654

merged 3 commits into from
Dec 27, 2023

Conversation

xxchan
Copy link
Member

@xxchan xxchan commented Nov 27, 2023

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

related:

I think this is a low-hanging fruit to resolve #9450 (or mitigate). After this change, we are able to reject invalid WITH options during CREATE SOURCE. The error is created by meta when creating the properties from the raw HashMap.

#7034 may allow us to validate the options even earlier in frontend, but I don't think it's super necessary after this. Correct me if I'm wrong.

Code changes:

  • add a unknown_fields: HashMap<String, String> for each source property struct, and add a flag to decide whether to reject unknown fields when creating them (ConnectorProperties::extract).
    • deny when creating new sources. Can search true) when reviewing.
    • accept when recovering old sources
  • move some nested fields to top-level, in order to let serde work better.

Note

All source options are fixed, except cdc and datagen. CDC needs to pass many options directly to debezium. Datagen has very complex config for each data field.

Sinks are not updated, since their code path is different. Leave for next PR.

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added test labels as necessary. See details.
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

@xxchan
Copy link
Member Author

xxchan commented Nov 27, 2023

Wait, is WITH options persisted (and thus this will be a breaking change..)?

@xxchan xxchan marked this pull request as draft November 27, 2023 03:42
@xiangjinwu
Copy link
Contributor

FYI there is an existing effort in handler/create_source.rs to reject unknown fields in WITH and it is problematic: #13362

@xxchan
Copy link
Member Author

xxchan commented Nov 27, 2023

FYI there is an existing effort in handler/create_source.rs to reject unknown fields in WITH and it is problematic: #13362

if !options.is_empty() {
let err_string = format!(
"Get unknown options for {:?} {:?}: {}",
source_schema.format,
source_schema.row_encode,
options
.keys()
.map(|k| k.to_string())
.collect::<Vec<String>>()
.join(","),
);
session.notice_to_user(err_string);
}

Seemingly those check are for format ... encode ... (options), instead of WITH options 🤡

@xiangjinwu
Copy link
Contributor

FYI there is an existing effort in handler/create_source.rs to reject unknown fields in WITH and it is problematic: #13362

Seemingly those check are for format ... encode ... (options), instead of WITH options 🤡

The darker side of the story is, source format ... encode ... is mixed with with options:

let mut options = with_properties.0.clone();
if let CompatibleSourceSchema::V2(source_schema) = source_schema {
options.extend_from_slice(source_schema.row_options());
}
Self::try_from(options.as_slice())

@tabVersion
Copy link
Contributor

FYI there is an existing effort in handler/create_source.rs to reject unknown fields in WITH and it is problematic: #13362

if !options.is_empty() {
let err_string = format!(
"Get unknown options for {:?} {:?}: {}",
source_schema.format,
source_schema.row_encode,
options
.keys()
.map(|k| k.to_string())
.collect::<Vec<String>>()
.join(","),
);
session.notice_to_user(err_string);
}

Seemingly those check are for format ... encode ... (options), instead of WITH options 🤡

Yes, there is some misuse of the two, @Rossil2012 is working on that

@xxchan
Copy link
Member Author

xxchan commented Nov 27, 2023

Anyway, I can't think of any reasons not to add the validation here, since serde is the last step. 🤔

Regardless of where the config come from, it should finally be a valid defined config.

@xxchan
Copy link
Member Author

xxchan commented Nov 27, 2023

Wait, is WITH options persisted (and thus this will be a breaking change..)?

Just tested this. If we apply this, a meta node with unknown fields persisted will fail to start.


Oh, create_source_worker_async is used for startup, while create_source_worker is used for registering new source, so maybe we can treat them differently.

@xxchan xxchan changed the title feat(connector): deny unknown fields for WITH options feat(source): deny unknown fields for WITH options Nov 27, 2023
@xxchan

This comment was marked as resolved.

@xxchan
Copy link
Member Author

xxchan commented Nov 27, 2023

Since the changes in this PR is very simple and non-invasive, maybe we should wait for other ongoing related refactors to be done first. We can come back to this easily.

@BugenZhao BugenZhao self-requested a review November 29, 2023 02:57
@xxchan xxchan added the ci/run-backwards-compat-tests Run backwards compatibility tests in your PR. label Dec 6, 2023
@xxchan xxchan removed ci/run-backwards-compat-tests Run backwards compatibility tests in your PR. ci/run-e2e-iceberg-sink-tests labels Dec 15, 2023
@xxchan xxchan requested a review from a team as a code owner December 25, 2023 12:03
@xxchan xxchan force-pushed the xxchan/deny branch 2 times, most recently from 25d69bf to eabda4b Compare December 25, 2023 12:08
@xxchan
Copy link
Member Author

xxchan commented Dec 25, 2023

Update: since #13414 updated S3 source options, which has some special handling logic. I'm doing some refactoring before merging this PR.

@xxchan xxchan changed the base branch from main to 12-25-refactor_connector_Add_a_dummy_trait_WithOptions_ December 25, 2023 12:13
@xxchan
Copy link
Member Author

xxchan commented Dec 25, 2023

@xxchan xxchan force-pushed the 12-25-refactor_connector_Add_a_dummy_trait_WithOptions_ branch from 307f0da to 5bb84dd Compare December 25, 2023 12:35
@xxchan xxchan force-pushed the xxchan/deny branch 2 times, most recently from f41a1ea to 84d2a91 Compare December 25, 2023 12:41
@xxchan xxchan force-pushed the 12-25-refactor_connector_Add_a_dummy_trait_WithOptions_ branch from 9af8ff0 to da8f4e4 Compare December 26, 2023 05:10
@xxchan xxchan force-pushed the 12-25-refactor_connector_Add_a_dummy_trait_WithOptions_ branch from da8f4e4 to 6173db2 Compare December 27, 2023 03:14
@xxchan xxchan force-pushed the 12-25-refactor_connector_Add_a_dummy_trait_WithOptions_ branch from 6173db2 to e2f5a96 Compare December 27, 2023 15:30
Base automatically changed from 12-25-refactor_connector_Add_a_dummy_trait_WithOptions_ to main December 27, 2023 16:02
commit 25d69bf
Author: xxchan <[email protected]>
Date:   Mon Dec 25 20:01:44 2023 +0800

    fmt

commit e53e7a2
Merge: e49429a 4ad1940
Author: xxchan <[email protected]>
Date:   Mon Dec 25 20:01:00 2023 +0800

    Merge branch 'main' into xxchan/deny

commit e49429a
Author: xxchan <[email protected]>
Date:   Sat Dec 16 00:57:56 2023 +0800

    clippy

commit 60e4117
Merge: 9f976f8 2243a02
Author: xxchan <[email protected]>
Date:   Fri Dec 15 23:13:11 2023 +0800

    Merge branch 'main' into xxchan/deny

commit 9f976f8
Author: xxchan <[email protected]>
Date:   Fri Dec 15 23:12:54 2023 +0800

    fix pulsar

commit 5df6dc5
Author: xxchan <[email protected]>
Date:   Fri Dec 15 17:56:37 2023 +0800

    fix connection.name

commit c2df222
Author: xxchan <[email protected]>
Date:   Fri Dec 15 17:30:05 2023 +0800

    fix test

commit 373a0d5
Merge: ec398f2 dffeedd
Author: xxchan <[email protected]>
Date:   Fri Dec 15 16:44:49 2023 +0800

    Merge branch 'main' into xxchan/deny

commit ec398f2
Author: xxchan <[email protected]>
Date:   Mon Nov 27 16:26:48 2023 +0800

    fix: don't reject invalid options for batch select

commit 025eea8
Merge: 84bde5f 45d2df5
Author: xxchan <[email protected]>
Date:   Mon Nov 27 16:18:17 2023 +0800

    Merge remote-tracking branch 'origin/main' into xxchan/deny

commit 84bde5f
Author: xxchan <[email protected]>
Date:   Mon Nov 27 16:15:52 2023 +0800

    revert deny_unknown_fields on sink.

commit 2e39b51
Author: xxchan <[email protected]>
Date:   Mon Nov 27 16:07:43 2023 +0800

    source: soft-fail with a flag to achive backwards compatibility

commit 22b8911
Author: xxchan <[email protected]>
Date:   Mon Nov 27 11:30:26 2023 +0800

    fmt

commit eede798
Author: xxchan <[email protected]>
Date:   Mon Nov 27 11:30:06 2023 +0800

    add err tests

commit 55bb9e5
Author: xxchan <[email protected]>
Date:   Mon Nov 27 10:58:07 2023 +0800

    except cdc and datagen
@xxchan xxchan enabled auto-merge December 27, 2023 16:25
@xxchan xxchan added this pull request to the merge queue Dec 27, 2023
Merged via the queue into main with commit de6ac8b Dec 27, 2023
11 of 12 checks passed
@xxchan xxchan deleted the xxchan/deny branch December 27, 2023 16:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci/run-backwards-compat-tests Run backwards compatibility tests in your PR. type/feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug: the with clause accepts non-existing option
7 participants