Doesn't properly handle non-ASCII UTF-8 characters in GELF input #54

Open
mikaelstaldal opened this issue Apr 28, 2017 · 0 comments

  • Version: Logstash 5.3.2
  • Operating System: Ubuntu 16.04
  • Config File:
input { gelf { } }
output {
  elasticsearch { hosts => ["localhost:9200"] }
  stdout { codec => json_lines }
}
  • Sample Data:
{"version":"1.1","host":"udp-zlib","timestamp":1493384075.248,"level":6,"_thread":"main","_logger":"HelloWorld","_additionalField1":"constant value","_additionalField2":"foo bar","_bar":"BAR","_foo":"FOO","short_message":"Hello, world! åäö 1"}

When the above GELF message is sent to Logstash from Log4j 2.8.2 (compressed with ZLIB, over UDP), the non-ASCII characters in short_message get garbled. It ends up like this in Elasticsearch:

 {
          "source_host": "127.0.0.1",
          "level": 6,
          "logger": "HelloWorld",
          "foo": "FOO",
          "thread": "main",
          "message": "Hello, world! åäö 1",
          "version": "1.1",
          "bar": "BAR",
          "@timestamp": "2017-04-28T10:28:06.263Z",
          "host": "udp-zlib",
          "@version": "1",
          "additionalField1": "constant value",
          "additionalField2": "foo bar"       
}

It seems that the Logstash GELF input doesn't decode the GELF message as UTF-8, as it should according to the GELF spec.
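
To illustrate the suspected cause, here is a minimal Python sketch. It is not taken from the plugin's code. It builds a trimmed version of the sample message, compresses it with ZLIB the way a GELF UDP sender would, and then decodes it twice: once as UTF-8 and once as Latin-1. The Latin-1 case is only an assumption about what kind of mis-decoding could produce garbled output; the helper names (`encode_gelf`, `decode_gelf`, `send_gelf`) and the `127.0.0.1:12201` target are hypothetical choices for this example.

```python
import json
import socket
import zlib

# Trimmed version of the sample GELF message from this report.
GELF_MESSAGE = {
    "version": "1.1",
    "host": "udp-zlib",
    "timestamp": 1493384075.248,
    "level": 6,
    "_thread": "main",
    "_logger": "HelloWorld",
    "short_message": "Hello, world! åäö 1",
}


def encode_gelf(message: dict) -> bytes:
    """Serialize to UTF-8 JSON and compress with ZLIB (hypothetical helper)."""
    raw = json.dumps(message, ensure_ascii=False).encode("utf-8")
    return zlib.compress(raw)


def decode_gelf(datagram: bytes, encoding: str = "utf-8") -> dict:
    """Decompress a datagram and parse the JSON text using the given encoding."""
    return json.loads(zlib.decompress(datagram).decode(encoding))


def send_gelf(datagram: bytes, host: str = "127.0.0.1", port: int = 12201) -> None:
    """Send one unchunked GELF datagram over UDP (host and port are assumptions)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram, (host, port))


datagram = encode_gelf(GELF_MESSAGE)

# Decoding the payload as UTF-8 round-trips the original text.
correct = decode_gelf(datagram)["short_message"]

# Decoding the same bytes as Latin-1 (an assumed failure mode) garbles it.
garbled = decode_gelf(datagram, encoding="latin-1")["short_message"]

print(correct)
print(garbled)
```

Calling `send_gelf(datagram)` against a local Logstash instance with the config above should reproduce the report, assuming the gelf input listens on its default UDP port.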
