You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
characters like ", \n that trigger additional processing
and non-latin characters that require the MSSQL data type to be NVARCHAR in order to be procesed correctly
..the post-bcp UPDATE step erroneously converts the data to VARCHAR and causes data corruption: non-english characters are replaced with question marks.
For example ex.csv:
foo,id
"a""b",1
йцук,2
loads as foo=???? for id=2.
But if you remove the double "", it loads corectly.
(As an aside, I still think that a better solution would be to let the user avoid post-processing altogether utilizing the bcp's ability to specify Row and Field terminators as I suggested a while ago.)
Sling version (sling --version): 1.4.3
Operating System (linux, mac, windows): windows
Replication Configuration:
sling run --src-stream 'file://C:/temp/ex.csv' --tgt-conn DEV --tgt-object 'dbo.ex' \
--mode truncate --tgt-options '{"use_bulk": true}' -d
Log Output (please run command with -d):
```
2025-03-05 14:45:36 INF Sling CLI | https://slingdata.io
2025-03-05 14:45:36 DBG opened "file" connection (conn-file-8Nb)
2025-03-05 14:45:36 DBG Sling version: 1.4.3 (windows amd64)
2025-03-05 14:45:36 DBG type is file-db
2025-03-05 14:45:36 DBG using: {"columns":null,"mode":"truncate","select":null,"transforms":null}
2025-03-05 14:45:36 DBG using source options: {"empty_as_null":true,"header":true,"fields_per_rec":-1,"compression":"auto","null_if":"NULL","datetime_format":"AUTO","skip_blank_lines":false,"max_decimals":-1}
2025-03-05 14:45:36 DBG using target options: {"datetime_format":"auto","file_max_rows":0,"max_decimals":-1,"use_bulk":true,"add_new_columns":true,"adjust_column_type":false,"column_casing":"source"}
2025-03-05 14:45:36 INF connecting to target database (sqlserver)
2025-03-05 14:45:36 DBG opened "sqlserver" connection (conn-sqlserver-ZC7)
2025-03-05 14:45:37 INF reading from source file system (file)
2025-03-05 14:45:37 DBG opened "file" connection (conn-file-7gW)
2025-03-05 14:45:37 DBG reading single datastream from file://C:/temp/ex.csv [format=csv]
2025-03-05 14:45:37 DBG merging csv readers of 1 files [concurrency=3] from file://C:/temp/ex.csv
2025-03-05 14:45:37 DBG processing reader from file://C:/temp/ex.csv
2025-03-05 14:45:37 DBG auto-decoding UTF-16 (LE)
2025-03-05 14:45:37 DBG delimiter auto-detected: ","
2025-03-05 14:45:37 INF writing to target database [mode: truncate]
2025-03-05 14:45:37 DBG IF OBJECT_ID(N'"dbo"."ex_tmp"', N'U') IS NOT NULL DROP TABLE "dbo"."ex_tmp"
2025-03-05 14:45:37 DBG table "dbo"."ex_tmp" dropped
2025-03-05 14:45:37 DBG create table "dbo"."ex_tmp" ("foo" nvarchar(max),
"id" bigint,
"_sling_loaded_at" integer)
2025-03-05 14:45:37 INF created table "dbo"."ex_tmp"
2025-03-05 14:45:37 INF streaming data
2025-03-05 14:45:37 DBG bcp version is 11
2025-03-05 14:45:37 WRN bcp version 11 is old. This may give issues with sling, consider upgrading.
2025-03-05 14:45:37 DBG C:\Program Files\Microsoft SQL Server\110\Tools\Binn\bcp.exe 'dbo.ex_tmp' in 'C:/Users/Nickolay/AppData/Local/Temp/sqlserver.dbo.ex_tmp.1741175137103.5661.csv' -S '****' -d 'Jira DWH' -t ',' -m '1' '-w' -q -b '50000' -F '2' -e 'C:/Users/Nickolay/AppData/Local/Temp/sqlserver.dbo.ex_tmp.1741175137135.X2c.error' -U 'etl' -P '****'
2025-03-05 14:45:37 DBG select count(*) cnt from "dbo"."ex_tmp"
2025-03-05 14:45:37 DBG truncate table "dbo"."ex"
2025-03-05 14:45:37 INF truncated table "dbo"."ex"
2025-03-05 14:45:37 DBG inserting id [bigint] into id [int]
2025-03-05 14:45:37 DBG inserting _sling_loaded_at [int] into _sling_loaded_at [bigint]
2025-03-05 14:45:37 DBG insert into "dbo"."ex" ("foo", "id", "_sling_loaded_at") select "foo" as "foo", "id" as "id", "_sling_loaded_at" as "_sling_loaded_at" from "dbo"."ex_tmp"
2025-03-05 14:45:37 DBG inserted rows into "dbo"."ex" from temp table "dbo"."ex_tmp"
2025-03-05 14:45:37 INF inserted 2 rows into "dbo"."ex" in 0 secs [6 r/s]
2025-03-05 14:45:37 DBG IF OBJECT_ID(N'"dbo"."ex_tmp"', N'U') IS NOT NULL DROP TABLE "dbo"."ex_tmp"
2025-03-05 14:45:37 DBG table "dbo"."ex_tmp" dropped
2025-03-05 14:45:37 DBG closed "sqlserver" connection (conn-sqlserver-ZC7)
2025-03-05 14:45:37 INF execution succeeded
```
The text was updated successfully, but these errors were encountered:
Thanks for reporting @nickolay , I had briefly looked into the different terminators and encountered some issues, don't remember though.
Can you share what terminator you would use? Actually, can you share the BCP command you would use with the flags?
Issue Description
When a character column contains both
"
,\n
that trigger additional processing..the post-bcp UPDATE step erroneously converts the data to
VARCHAR
and causes data corruption: non-english characters are replaced with question marks.For example
ex.csv
:loads as
foo=????
for id=2.But if you remove the double
""
, it loads corectly.(As an aside, I still think that a better solution would be to let the user avoid post-processing altogether utilizing the bcp's ability to specify Row and Field terminators as I suggested a while ago.)
Sling version (
sling --version
): 1.4.3Operating System (
linux
,mac
,windows
): windowsReplication Configuration:
-d
):The text was updated successfully, but these errors were encountered: