Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Structured Syslog to Clickhouse throws CANNOT_PARSE_QUOTED_STRING #19448

Closed
joshbartley opened this issue Dec 21, 2023 · 2 comments
Closed

Structured Syslog to Clickhouse throws CANNOT_PARSE_QUOTED_STRING #19448

joshbartley opened this issue Dec 21, 2023 · 2 comments
Labels
sink: clickhouse Anything `clickhouse` sink related

Comments

@joshbartley
Copy link

A note for the community

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Problem

I'm setting up a syslog source transformed to a structured format that I want to push to a clickhouse sink. I set the schema on clickhouse to all strings which doesn't seem like it helps as it's an error thrown at parsing. I think the Meta object is causing the issue. If I take the JSON out from the emit_syslog call and do a manual call to Clickhouse it throws a Code: 26. DB::ParsingException: Cannot parse JSON string: expected opening quote: (while reading the value of key meta): While executing ParallelParsingBlockInputFormat: (at row 1) . (CANNOT_PARSE_QUOTED_STRING) (version 23.9.2.47608 (official build)) Removing the Meta property from the JSON and it passes with a 200 instead of a 400 over HTTP. I tried to pull a copy of the message direct from vector through a proxy (Fiddler) but couldn't get that to work for some reason.

Configuration

sources:
  testsyslog:
    type: syslog
    address: 0.0.0.0:9000
    mode: tcp 

transforms:
  remap_syslog:
    inputs:
      - "testsyslog"
    type:   "remap"
    source: |
            structured = parse_syslog!(.message)
            . = merge(., structured)
            .procid = to_string(.procid)
            .version = to_string(.version)  
sinks: 
  emit_syslog:
    inputs:
      - "remap_syslog"
    type: "console"
    encoding:
      codec: "json"

  emit_clickhouse:
    type: clickhouse
    auth:
        strategy: "basic"
        user: "************"
        password: "*****************"
    batch:
        max_bytes: 1048576
        max_events: 5000
        timeout_seconds: 600
    inputs:
      - "remap_syslog"
    endpoint: https://******************.aws.clickhouse.cloud:8443
    table: syslogMessages
    skip_unknown_fields: true

Version

vector 0.34.1 (x86_64-pc-windows-msvc 86f1c22 2023-11-16 14:59:10.486846964)

Debug Output

2023-12-21T19:14:54.172038Z DEBUG sink{component_kind="sink" component_id=cd_clickhouse component_type=clickhouse}:request{request_id=3}:http: vector::internal_events::http_client: HTTP response. status=400 Bad Request version=HTTP/1.1 headers={"date": "Thu, 21 Dec 2023 19:14:54 GMT", "connection": "Keep-Alive", "content-type": "text/plain; charset=UTF-8", "x-clickhouse-server-display-name": "clickhouse-cloud", "transfer-encoding": "chunked", "x-clickhouse-query-id": "648e1cf7-9907-47fc-8496-26669d62e235", "x-clickhouse-timezone": "UTC", "x-clickhouse-exception-code": "26", "keep-alive": "timeout=10", "x-clickhouse-summary": "{\"read_rows\":\"0\",\"read_bytes\":\"0\",\"written_rows\":\"0\",\"written_bytes\":\"0\",\"total_rows_to_read\":\"0\",\"result_rows\":\"0\",\"result_bytes\":\"0\",\"elapsed_ns\":\"3289074\"}"} body=[unknown]

Example Data

{"appname":"xxxxxx","facility":"local0","host":"LAPTOP-XXXXXXXXXX","hostname":"LAPTOP-XXXXXXXXXX","message":"Request finished HTTP/2 POST https://localhost:5501/mini-profiler/results application/json 832 - 200 14253 application/json 0.2664ms","meta":{"ConnectionId":"0HN02B3V95686","ContentLength":"14253","ContentType":"application/json","ElapsedMilliseconds":"0.2664","EventId":"{ Id: 2 }","Host":"localhost:5501","HostingRequestFinishedLog":"Request finished HTTP/2 POST https://localhost:5501/mini-profiler/results application/json 832 - 200 14253 application/json 0.2664ms","Method":"POST","Path":"/mini-profiler/results","PathBase":"","Protocol":"HTTP/2","QueryString":"","RequestId":"0HN02B3V95686:00000211","RequestPath":"/mini-profiler/results","Scheme":"https","SourceContext":"Microsoft.AspNetCore.Hosting.Diagnostics","StatusCode":"200"},"msgid":"Microsoft.AspNetCore.Hosting.Dia","procid":67884,"severity":"info","source_ip":"127.0.0.1","source_type":"syslog","timestamp":"2023-12-21T19:14:53.706270Z","version":1}

Additional Context

No response

References

No response

@joshbartley joshbartley added the type: bug A code related bug. label Dec 21, 2023
@dsmith3197
Copy link
Contributor

dsmith3197 commented Dec 22, 2023

This Clickhouse issue may help you. It sounds like meta has a string type in your Clickhouse table but you are trying to insert a map.

You can either (1) update your Clickhouse table to have the correct types or (2) transform the events in Vector to convert the meta field to a string. For the latter, you could use the encode_json VRL function in a remap transform.

@dsmith3197 dsmith3197 added sink: clickhouse Anything `clickhouse` sink related and removed type: bug A code related bug. labels Dec 22, 2023
@joshbartley
Copy link
Author

That did it. I change the Meta column to JSON type with the experimental option and it went right through. Example below that fixed it.

SET allow_experimental_object_type = 1;
CREATE TABLE IF NOT EXISTS  default.syslogJSONMessages (
    appname String,
    facility String,
    host String,
    hostname String,
    meta JSON,
    msgid String, 
    severity String,
    source_ip String,
    source_type String,
    `timestamp` String, 
)
ENGINE = MergeTree()
Primary KEY (`timestamp`,`host`)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sink: clickhouse Anything `clickhouse` sink related
Projects
None yet
Development

No branches or pull requests

2 participants