
[BUG] Incorrect error message when using INSERT SELECT * and the source table has more columns than the target table #3701

Open
felipepessoto opened this issue Sep 20, 2024 · 1 comment
Labels: bug (Something isn't working)


felipepessoto (Contributor) commented Sep 20, 2024

Bug

Which Delta project/connector is this regarding?

  • Spark
  • Standalone
  • Flink
  • Kernel
  • Other (fill in here)

Describe the problem

The error message is misleading: [DELTA_DUPLICATE_COLUMNS_FOUND] Found duplicate column(s) in the data to save: name

Steps to reproduce

DROP TABLE IF EXISTS MySourceTable;
DROP TABLE IF EXISTS MyTargetTable;
CREATE TABLE MySourceTable USING DELTA AS SELECT 1 as Id, 30 as Age, 'John' as Name;
CREATE TABLE MyTargetTable (Id INT, Name STRING) USING DELTA;
INSERT INTO MyTargetTable SELECT * FROM MySourceTable;
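
For comparison, explicitly selecting the matching columns avoids the positional expansion of * and inserts as expected:

-- Workaround given the schemas above: select only the target's columns.
INSERT INTO MyTargetTable SELECT Id, Name FROM MySourceTable;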

Observed results

[DELTA_DUPLICATE_COLUMNS_FOUND] Found duplicate column(s) in the data to save: name
org.apache.spark.sql.delta.schema.SchemaMergingUtils$.checkColumnNameDuplication(SchemaMergingUtils.scala:123)
org.apache.spark.sql.delta.schema.SchemaMergingUtils$.mergeSchemas(SchemaMergingUtils.scala:168)
org.apache.spark.sql.delta.schema.ImplicitMetadataOperation$.mergeSchema(ImplicitMetadataOperation.scala:219)
org.apache.spark.sql.delta.schema.ImplicitMetadataOperation.updateMetadata(ImplicitMetadataOperation.scala:84)
org.apache.spark.sql.delta.schema.ImplicitMetadataOperation.updateMetadata$(ImplicitMetadataOperation.scala:66)
org.apache.spark.sql.delta.commands.WriteIntoDelta.updateMetadata(WriteIntoDelta.scala:77)
org.apache.spark.sql.delta.commands.WriteIntoDelta.writeAndReturnCommitData(WriteIntoDelta.scala:162)
org.apache.spark.sql.delta.commands.WriteIntoDelta.$anonfun$run$1(WriteIntoDelta.scala:106)
org.apache.spark.sql.delta.commands.WriteIntoDelta.$anonfun$run$1$adapted(WriteIntoDelta.scala:101)
org.apache.spark.sql.delta.DeltaLog.withNewTransaction(DeltaLog.scala:227)
org.apache.spark.sql.delta.commands.WriteIntoDelta.run(WriteIntoDelta.scala:101)
org.apache.spark.sql.delta.catalog.WriteIntoDeltaBuilder$$anon$1$$anon$2.insert(DeltaTableV2.scala:432)
org.apache.spark.sql.execution.datasources.v2.SupportsV1Write.writeWithV1(V1FallbackWriters.scala:79)

Expected results

A message saying the data source schema doesn't match the target table's columns, i.e. that the number of columns in the inserted data doesn't match the target table.
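
For illustration, a clearer message could look something like the following (hypothetical wording, modeled on Spark's INSERT_COLUMN_ARITY_MISMATCH style of arity errors):

Cannot write to MyTargetTable: the target table has 2 column(s) (Id, Name) but the inserted data has 3 column(s) (Id, Age, Name).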

Environment information

  • Delta Lake version: 3.2
  • Spark version: 3.5
  • Scala version: 2.12

Willingness to contribute

The Delta Lake Community encourages bug fix contributions. Would you or another member of your organization be willing to contribute a fix for this bug to the Delta Lake code base?

  • Yes. I can contribute a fix for this bug independently.
  • Yes. I would be willing to contribute a fix for this bug with guidance from the Delta Lake community.
  • No. I cannot contribute a bug fix at this time.
felipepessoto (Contributor, Author) commented:

It happens because Spark expands the * in INSERT INTO MyTargetTable SELECT * FROM MySourceTable by position into INSERT INTO MyTargetTable SELECT Id as Id, Age as Name, Name FROM MySourceTable, which makes sense since the second column in the target is Name. The leftover third column, Name, then collides with the renamed Age during schema merging. I think we need a column count validation first; a sketch of such a check follows the plan below.

Project [Id#724 AS Id#760, cast(Age#725 as string) AS Name#761, Name#726]
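
A minimal sketch of what such a validation could look like, assuming it runs before the schema merge in SchemaMergingUtils.mergeSchemas (the helper name checkColumnArity and the exception type are illustrative, not existing Delta APIs):

import org.apache.spark.sql.types.StructType

// Hypothetical pre-check: fail fast with a clear message when the incoming
// data has a different number of columns than the target table, instead of
// letting the positional rename surface later as a duplicate-column error.
def checkColumnArity(tableName: String, target: StructType, data: StructType): Unit = {
  if (data.length != target.length) {
    throw new IllegalArgumentException(
      s"Cannot write to $tableName: the target table has ${target.length} column(s) " +
      s"(${target.fieldNames.mkString(", ")}) but the inserted data has " +
      s"${data.length} column(s) (${data.fieldNames.mkString(", ")}).")
  }
}

With the schemas from the repro, this would report 2 target columns (Id, Name) versus 3 data columns (Id, Age, Name) rather than the misleading duplicate-column error. A real fix would also need to account for schema evolution, where extra data columns are allowed when mergeSchema is enabled.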
