-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[improvement](jdbc catalog) Optimize JdbcCatalog case mapping stability #41510
Conversation
Thank you for your contribution to Apache Doris. Since 2024-03-18, the Document has been moved to doris-website. |
run buildall |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add need to add more FE unit test to test all these logics
fe/fe-core/src/main/java/org/apache/doris/datasource/mapping/IdentifierMapping.java
Outdated
Show resolved
Hide resolved
fe/fe-core/src/main/java/org/apache/doris/datasource/mapping/DefaultIdentifierMapping.java
Outdated
Show resolved
Hide resolved
fe/fe-core/src/main/java/org/apache/doris/datasource/mapping/DefaultIdentifierMapping.java
Outdated
Show resolved
Hide resolved
fe/fe-core/src/main/java/org/apache/doris/datasource/mapping/DefaultIdentifierMapping.java
Outdated
Show resolved
Hide resolved
fe/fe-core/src/main/java/org/apache/doris/datasource/mapping/DefaultIdentifierMapping.java
Outdated
Show resolved
Hide resolved
fe/fe-core/src/main/java/org/apache/doris/datasource/jdbc/JdbcExternalCatalog.java
Outdated
Show resolved
Hide resolved
fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalCatalog.java
Outdated
Show resolved
Hide resolved
fe/fe-core/src/main/java/org/apache/doris/datasource/mapping/DefaultIdentifierMapping.java
Outdated
Show resolved
Hide resolved
1f71caf
to
a833914
Compare
run buildall |
a833914
to
a310c8f
Compare
run buildall |
a310c8f
to
1b1af81
Compare
run buildall |
4483c21
to
ac6a485
Compare
run buildall |
d440a07
to
313cb95
Compare
72c6f9a
to
9185dca
Compare
fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalCatalog.java
Outdated
Show resolved
Hide resolved
String remoteColumnName = jdbcCatalog.getRemoteColumnNames(this.dbName, this.name, column.getName()); | ||
remoteColumnNames.put(column.getName(), remoteColumnName); | ||
} | ||
if (!remoteColumnNames.isEmpty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the remoteColumnNames
is always not empty
fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalCatalog.java
Show resolved
Hide resolved
dbName -> Optional.ofNullable( | ||
buildDbForInit(dbName, Util.genIdByName(name, dbName), logType, true)), | ||
(key, value, cause) -> value.ifPresent(v -> v.setUnInitialized(invalidCacheInInit))); | ||
ignored -> remoteToLocalPairs.stream().map(Pair::value).collect(Collectors.toList()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not right.
If you do this, the dbnames
cache will always be remoteToLocalPairs
, even if there are new dbs added.
fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalCatalog.java
Show resolved
Hide resolved
@@ -76,6 +79,8 @@ public abstract class ExternalDatabase<T extends ExternalTable> | |||
protected long id; | |||
@SerializedName(value = "name") | |||
protected String name; | |||
@SerializedName(value = "remoteName") | |||
protected String remoteName; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remoteName
maybe null when upgrading from old version. How to handle it?
@@ -85,6 +87,7 @@ public class ExternalTable implements TableIf, Writable, GsonPostProcessable { | |||
protected long dbId; | |||
protected boolean objectCreated; | |||
protected ExternalCatalog catalog; | |||
protected ExternalDatabase db; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should also mark dbName
as deprecated?
@@ -70,6 +70,8 @@ public class ExternalTable implements TableIf, Writable, GsonPostProcessable { | |||
protected long id; | |||
@SerializedName(value = "name") | |||
protected String name; | |||
@SerializedName(value = "remoteName") | |||
protected String remoteName; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remoteName
maybe null when upgrading from old version. How to handle it?
@Override | ||
public boolean tableExist(SessionContext ctx, String dbName, String tblName) { | ||
makeSureInitialized(); | ||
return jdbcClient.isTableExist(dbName, tblName); | ||
ExternalTable tbl = Objects.requireNonNull(this.getDbNullable(dbName)).getTableNullable(tblName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How to make sure tbl
is always not null?
fe/fe-core/src/main/java/org/apache/doris/datasource/mapping/IdentifierMapping.java
Outdated
Show resolved
Hide resolved
@@ -49,6 +49,7 @@ | |||
import com.google.common.collect.Sets; | |||
import com.google.gson.annotations.SerializedName; | |||
import com.google.gson.internal.LinkedTreeMap; | |||
import jdk.internal.joptsimple.internal.Strings; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrong import, use guava
690a196
to
af1edc6
Compare
@@ -781,6 +781,15 @@ protected ExternalDatabase<? extends ExternalTable> buildDbForInit(String remote | |||
return null; | |||
} | |||
} | |||
} catch (RuntimeException e) { | |||
// Handle "Found conflicting" exception explicitly | |||
if (e.getMessage().contains("Found conflicting")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better define "Found conflicting" as a static field.
and use it here and in RuntimeException msg
LOG.error(e.getMessage()); | ||
throw e; // Rethrow to let the caller handle this critical issue | ||
} else { | ||
LOG.warn("Failed to check db {} exist in remote system, ignore it.", localDbName, e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this can be ignored?
Add comment in code
Hi, I found a bug witch can be fixed in this pr, can you add a test case like #45521 in |
af1edc6
to
1e38576
Compare
run buildall |
1 similar comment
run buildall |
TPC-H: Total hot run time: 39494 ms
|
TPC-DS: Total hot run time: 188766 ms
|
ClickBench: Total hot run time: 31.32 s
|
run buildall |
PR approved by at least one committer and no changes requested. |
PR approved by anyone and no changes requested. |
…ty (apache#41510) 1. Cancel the preservation of name mapping by IdentifierMapping, and only use it as the logic of case processing 2. Save the remote name in ExternalDatabase/ExternalTable object for get 3. Use fromRemoteXXName to process the remote name and map it to the local. Currently, only JdbcExternalCatalog uses IdentifierMapping for processing in this method, and other Catalogs are not processed. If processing is required, it should be extended by yourself 4. When lower_case_meta_name is true, the database, table, and column names mapped remotely to local will be checked to see if they are only case-sensitive. 5. Since Doris's column names are case-insensitive, column names will be checked at any time 6. When Fe config's lower_case_table_names is 1 or 2, the table names mapped from remote to local will be checked for only case-sensitive. 7. include/exclude_database_lists The remote name must be used 8. meta_names_mapping is used to resolve the conflicts mentioned above 9. If the same remoteName is in include_database_lists and meta_names_mapping, the filtered remoteName will be displayed according to the name mapped by meta_names_mapping 10. Since IdentifierMapping is no longer used to store the uppercase and lowercase name mapping, ExternalObject itself is used instead, which solves the problem of frequent refresh and small mapping loss 11. Remove the logic of rebuilding JdbcClient every time refreshCatalog
…ty (apache#41510) 1. Cancel the preservation of name mapping by IdentifierMapping, and only use it as the logic of case processing 2. Save the remote name in ExternalDatabase/ExternalTable object for get 3. Use fromRemoteXXName to process the remote name and map it to the local. Currently, only JdbcExternalCatalog uses IdentifierMapping for processing in this method, and other Catalogs are not processed. If processing is required, it should be extended by yourself 4. When lower_case_meta_name is true, the database, table, and column names mapped remotely to local will be checked to see if they are only case-sensitive. 5. Since Doris's column names are case-insensitive, column names will be checked at any time 6. When Fe config's lower_case_table_names is 1 or 2, the table names mapped from remote to local will be checked for only case-sensitive. 7. include/exclude_database_lists The remote name must be used 8. meta_names_mapping is used to resolve the conflicts mentioned above 9. If the same remoteName is in include_database_lists and meta_names_mapping, the filtered remoteName will be displayed according to the name mapped by meta_names_mapping 10. Since IdentifierMapping is no longer used to store the uppercase and lowercase name mapping, ExternalObject itself is used instead, which solves the problem of frequent refresh and small mapping loss 11. Remove the logic of rebuilding JdbcClient every time refreshCatalog
…ty (apache#41510) 1. Cancel the preservation of name mapping by IdentifierMapping, and only use it as the logic of case processing 2. Save the remote name in ExternalDatabase/ExternalTable object for get 3. Use fromRemoteXXName to process the remote name and map it to the local. Currently, only JdbcExternalCatalog uses IdentifierMapping for processing in this method, and other Catalogs are not processed. If processing is required, it should be extended by yourself 4. When lower_case_meta_name is true, the database, table, and column names mapped remotely to local will be checked to see if they are only case-sensitive. 5. Since Doris's column names are case-insensitive, column names will be checked at any time 6. When Fe config's lower_case_table_names is 1 or 2, the table names mapped from remote to local will be checked for only case-sensitive. 7. include/exclude_database_lists The remote name must be used 8. meta_names_mapping is used to resolve the conflicts mentioned above 9. If the same remoteName is in include_database_lists and meta_names_mapping, the filtered remoteName will be displayed according to the name mapped by meta_names_mapping 10. Since IdentifierMapping is no longer used to store the uppercase and lowercase name mapping, ExternalObject itself is used instead, which solves the problem of frequent refresh and small mapping loss 11. Remove the logic of rebuilding JdbcClient every time refreshCatalog
…ty (apache#41510) 1. Cancel the preservation of name mapping by IdentifierMapping, and only use it as the logic of case processing 2. Save the remote name in ExternalDatabase/ExternalTable object for get 3. Use fromRemoteXXName to process the remote name and map it to the local. Currently, only JdbcExternalCatalog uses IdentifierMapping for processing in this method, and other Catalogs are not processed. If processing is required, it should be extended by yourself 4. When lower_case_meta_name is true, the database, table, and column names mapped remotely to local will be checked to see if they are only case-sensitive. 5. Since Doris's column names are case-insensitive, column names will be checked at any time 6. When Fe config's lower_case_table_names is 1 or 2, the table names mapped from remote to local will be checked for only case-sensitive. 7. include/exclude_database_lists The remote name must be used 8. meta_names_mapping is used to resolve the conflicts mentioned above 9. If the same remoteName is in include_database_lists and meta_names_mapping, the filtered remoteName will be displayed according to the name mapped by meta_names_mapping 10. Since IdentifierMapping is no longer used to store the uppercase and lowercase name mapping, ExternalObject itself is used instead, which solves the problem of frequent refresh and small mapping loss 11. Remove the logic of rebuilding JdbcClient every time refreshCatalog
### What problem does this PR solve? Related PR: #41510 Problem Summary: We should use remote name to get statistics
### What problem does this PR solve? Related PR: #41510 Problem Summary: We should use remote name to get statistics
### What problem does this PR solve? Related PR: #41510 Problem Summary: In this pr#41510, namesCache changed from `LoadingCache<String, List<String>>` to `LoadingCache<String, List<Pair<String, String>>>`, but the `remove` method of this cache did not change, causing the hms event case to fail.
### What problem does this PR solve? Related PR: #41510 Problem Summary: In this pr#41510, namesCache changed from `LoadingCache<String, List<String>>` to `LoadingCache<String, List<Pair<String, String>>>`, but the `remove` method of this cache did not change, causing the hms event case to fail.
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)