Replication Trino to Trino: varchar fields are written as null in the first row #373

Open
RaulBSC opened this issue Sep 6, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@RaulBSC

RaulBSC commented Sep 6, 2024

Issue Description

  • Replication Trino to Trino: varchar fields are written as null in the first row:
    In a Trino-to-Trino replication, the varchar fields of the first row processed are not written; they are inserted as null.
    I have tried several modes (full-refresh, truncate, snapshot) and tables with both many and few fields, and the result is always the same: fields of type varchar are inserted as null in the first row processed.

  • Sling version: 1.2.18

  • Operating System: linux

  • Replication Configuration:

source: TRINO_BATCH
target: TRINO_BATCH

defaults:
  mode: full-refresh

streams:
  dm_skm_gold.dim_datetime_30min:
    object: iceberg_xfs.dm_skm_gold.dim_dt_30_2

  • Log Output (please run command with -d):

2024-09-06 15:03:07 INF Sling Replication [1 streams] | TRINO_BATCH -> TRINO_BATCH

2024-09-06 15:03:07 INF [1 / 1] running stream dm_skm_gold.dim_datetime_30min
2024-09-06 15:03:07 DBG Sling version: 1.2.19.dev (2024-09-05) (linux amd64)
2024-09-06 15:03:07 DBG type is db-db
2024-09-06 15:03:07 DBG using: {"columns":null,"mode":"full-refresh","transforms":null}
2024-09-06 15:03:07 DBG using source options: {"empty_as_null":false,"null_if":"NULL","datetime_format":"AUTO","max_decimals":-1}
2024-09-06 15:03:07 DBG using target options: {"datetime_format":"auto","file_max_rows":0,"max_decimals":-1,"use_bulk":true,"add_new_columns":true,"adjust_column_type":false,"column_casing":"source"}
2024-09-06 15:03:07 DBG opened "trino" connection (conn-trino-6I7)
2024-09-06 15:03:07 DBG opened "trino" connection (conn-trino-css)
2024-09-06 15:03:07 INF connecting to source database (trino)
2024-09-06 15:03:07 INF connecting to target database (trino)
2024-09-06 15:03:08 INF reading from source database
2024-09-06 15:03:08 DBG select * from "dm_skm_gold"."dim_datetime_30min"
2024-09-06 15:03:09 INF writing to target database [mode: full-refresh]
2024-09-06 15:03:09 DBG drop table if exists "dm_skm_gold"."dim_dt_30_2_tmp"
2024-09-06 15:03:09 DBG table "dm_skm_gold"."dim_dt_30_2_tmp" dropped
2024-09-06 15:03:10 DBG create table if not exists "dm_skm_gold"."dim_dt_30_2_tmp" ("date_time_id" integer,
"date_time_day_id" integer,
"yyyymmdd" timestamp,
"yyyymmdd_30" timestamp,
"day_num_id" integer,
"subhr_id" integer,
"subhr" varchar,
"year_id" integer,
"month_id" bigint,
"day_id" integer,
"iso_week_id" integer,
"month_name" varchar,
"month_name_abr" varchar,
"week_day" varchar,
"week_day_abr" varchar,
"stamp" integer)
2024-09-06 15:03:10 INF streaming data
2024-09-06 15:04:35 DBG select count(*) cnt from "dm_skm_gold"."dim_dt_30_2_tmp"
2024-09-06 15:04:35 DBG drop table if exists "dm_skm_gold"."dim_dt_30_2"
2024-09-06 15:04:35 DBG table "dm_skm_gold"."dim_dt_30_2" dropped
2024-09-06 15:04:36 DBG create table if not exists "dm_skm_gold"."dim_dt_30_2" ("date_time_id" integer,
"date_time_day_id" integer,
"yyyymmdd" timestamp,
"yyyymmdd_30" timestamp,
"day_num_id" integer,
"subhr_id" integer,
"subhr" varchar,
"year_id" integer,
"month_id" bigint,
"day_id" integer,
"iso_week_id" integer,
"month_name" varchar,
"month_name_abr" varchar,
"week_day" varchar,
"week_day_abr" varchar,
"stamp" integer)
2024-09-06 15:04:36 INF created table "dm_skm_gold"."dim_dt_30_2"
2024-09-06 15:04:37 DBG insert into "dm_skm_gold"."dim_dt_30_2" ("date_time_id", "date_time_day_id", "yyyymmdd", "yyyymmdd_30", "day_num_id", "subhr_id", "subhr", "year_id", "month_id", "day_id", "iso_week_id", "month_name", "month_name_abr", "week_day", "week_day_abr", "stamp") select "date_time_id", "date_time_day_id", "yyyymmdd", "yyyymmdd_30", "day_num_id", "subhr_id", "subhr", "year_id", "month_id", "day_id", "iso_week_id", "month_name", "month_name_abr", "week_day", "week_day_abr", "stamp" from "dm_skm_gold"."dim_dt_30_2_tmp"
2024-09-06 15:04:38 DBG inserted rows into "dm_skm_gold"."dim_dt_30_2" from temp table "dm_skm_gold"."dim_dt_30_2_tmp"
2024-09-06 15:04:38 INF inserted 11999 rows into "dm_skm_gold"."dim_dt_30_2" in 90 secs [133 r/s] [1.7 MB]
2024-09-06 15:04:38 DBG drop table if exists "dm_skm_gold"."dim_dt_30_2_tmp"
2024-09-06 15:04:38 DBG table "dm_skm_gold"."dim_dt_30_2_tmp" dropped
2024-09-06 15:04:38 DBG closed "trino" connection (conn-trino-css)
2024-09-06 15:04:38 INF execution succeeded

2024-09-06 15:04:38 INF Sling Replication Completed in 1m 30s | TRINO_BATCH -> TRINO_BATCH | 1 Successes | 0 Failures
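
One way to confirm the symptom after a run is to compare source and target rows and list the varchar values that exist in the source but are null in the target. The following is only a minimal sketch, not part of Sling: it assumes `date_time_id` uniquely identifies a row (the report does not say so), that rows have already been fetched as dictionaries with whatever Trino client is at hand, and the function name `find_lost_values` and the sample rows are hypothetical.

```python
from typing import Any, Dict, List, Sequence, Tuple

# Varchar columns taken from the CREATE TABLE statement in the log above.
VARCHAR_COLUMNS = ("subhr", "month_name", "month_name_abr", "week_day", "week_day_abr")


def find_lost_values(
    source_rows: Sequence[Dict[str, Any]],
    target_rows: Sequence[Dict[str, Any]],
    key: str = "date_time_id",  # assumption: unique per row
    columns: Sequence[str] = VARCHAR_COLUMNS,
) -> List[Tuple[Any, str]]:
    """Return (key value, column) pairs that are non-null in source but null in target."""
    target_by_key = {row[key]: row for row in target_rows}
    lost: List[Tuple[Any, str]] = []
    for src in source_rows:
        tgt = target_by_key.get(src[key])
        if tgt is None:
            continue  # row missing entirely, which is a different problem
        for col in columns:
            if src.get(col) is not None and tgt.get(col) is None:
                lost.append((src[key], col))
    return lost


# Hypothetical sample rows that mimic the reported behaviour.
source = [
    {"date_time_id": 1, "subhr": "00:00", "month_name": "January"},
    {"date_time_id": 2, "subhr": "00:30", "month_name": "January"},
]
target = [
    {"date_time_id": 1, "subhr": None, "month_name": None},
    {"date_time_id": 2, "subhr": "00:30", "month_name": "January"},
]

print(find_lost_values(source, target, columns=("subhr", "month_name")))
# → [(1, 'subhr'), (1, 'month_name')]
```

If the bug is as described, the result should contain only the varchar columns of a single row, the first one Sling processed.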

@flarco
Collaborator

flarco commented Sep 6, 2024

Never mind, I was confusing it with some other read-only connector. I'll take a look.

@flarco flarco added the bug Something isn't working label Sep 6, 2024