The incoming YAML document exceeds the limit: 3145728 code points #96

Closed
SnDsound opened this issue Feb 24, 2023 · 4 comments · Fixed by #97

@SnDsound

Hi folks,

Description of the problem including expected versus actual behavior:
After the change introduced in version 8.6.1 (Updated snakeyaml to 1.33 #14848), my Logstash stopped working. That change introduces a 3 MB limit on YAML files because of a CVE. I'm using the translate filter plugin with large YAML files as input. In version 8.6.0 everything works, because there is no file size limit. In version 8.6.2 the pipeline fails to load.

Logstash information:

  1. Logstash version: 8.6.2
  2. Logstash installation source: Docker
  3. How is Logstash being run: Docker

Plugins installed: (bin/logstash-plugin list --verbose)
logstash-filter-translate (3.4.0)

Steps to reproduce:

Use a large YAML file (25 MB in my case) with the translate plugin:

translate {
    id => "filter_translate_123456"
    source => "something.ip"
    target => "something.name"
    exact => "true"
    refresh_interval => 0
    refresh_behaviour => "replace"
    dictionary_path => '/usr/share/logstash/files/largefile.yaml'
}

Provide logs (if relevant):

[2023-02-24T14:58:10,950][ERROR][logstash.javapipeline    ][pipeline_translate] Pipeline error {:pipeline_id=>"pipeline_translate", :exception=>#<LogStash::Filters::Dictionary::DictionaryFileError: Translate: The incoming YAML document exceeds the limit: 3145728 code points. when loading dictionary file at /usr/share/logstash/files/largefile.yaml>, :backtrace=>["org.yaml.snakeyaml.scanner.ScannerImpl.fetchMoreTokens(ScannerImpl.java:342)", "org.yaml.snakeyaml.scanner.ScannerImpl.checkToken(ScannerImpl.java:263)", "org.yaml.snakeyaml.parser.ParserImpl$ParseBlockMappingKey.produce(ParserImpl.java:662)", "org.yaml.snakeyaml.parser.ParserImpl.peekEvent(ParserImpl.java:185)", "org.yaml.snakeyaml.parser.ParserImpl.getEvent(ParserImpl.java:195)", "org.jruby.ext.psych.PsychParser.parse(PsychParser.java:210)", "org.jruby.ext.psych.PsychParser$INVOKER$i$parse.call(PsychParser$INVOKER$i$parse.gen)", "usr.share.logstash.vendor.jruby.lib.ruby.stdlib.psych.RUBY$method$parse_stream$0(/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/psych.rb:460)", "usr.share.logstash.vendor.bundle.jruby.$2_dot_6_dot_0.gems.logstash_minus_filter_minus_translate_minus_3_dot_4_dot_0.lib.logstash.filters.dictionary.yaml_file.RUBY$method$read_file_into_dictionary$0(/usr/share/logstash/vendor/bundle/jruby/2.6.0/gems/logstash-filter-translate-3.4.0/lib/logstash/filters/dictionary/yaml_file.rb:19)", "usr.share.logstash.vendor.bundle.jruby.$2_dot_6_dot_0.gems.logstash_minus_filter_minus_translate_minus_3_dot_4_dot_0.lib.logstash.filters.dictionary.file.RUBY$method$merge_dictionary$0(/usr/share/logstash/vendor/bundle/jruby/2.6.0/gems/logstash-filter-translate-3.4.0/lib/logstash/filters/dictionary/file.rb:84)", "org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:152)", "org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:148)", "org.jruby.RubyMethod.call(RubyMethod.java:116)", 
"usr.share.logstash.vendor.bundle.jruby.$2_dot_6_dot_0.gems.logstash_minus_filter_minus_translate_minus_3_dot_4_dot_0.lib.logstash.filters.dictionary.file.RUBY$method$load_dictionary$0(/usr/share/logstash/vendor/bundle/jruby/2.6.0/gems/logstash-filter-translate-3.4.0/lib/logstash/filters/dictionary/file.rb:56)", "usr.share.logstash.vendor.bundle.jruby.$2_dot_6_dot_0.gems.logstash_minus_filter_minus_translate_minus_3_dot_4_dot_0.lib.logstash.filters.dictionary.file.RUBY$method$initialize$0(/usr/share/logstash/vendor/bundle/jruby/2.6.0/gems/logstash-filter-translate-3.4.0/lib/logstash/filters/dictionary/file.rb:50)", "usr.share.logstash.vendor.bundle.jruby.$2_dot_6_dot_0.gems.logstash_minus_filter_minus_translate_minus_3_dot_4_dot_0.lib.logstash.filters.dictionary.file.RUBY$method$create$0(/usr/share/logstash/vendor/bundle/jruby/2.6.0/gems/logstash-filter-translate-3.4.0/lib/logstash/filters/dictionary/file.rb:14)", "usr.share.logstash.vendor.bundle.jruby.$2_dot_6_dot_0.gems.logstash_minus_filter_minus_translate_minus_3_dot_4_dot_0.lib.logstash.filters.translate.RUBY$method$register$0(/usr/share/logstash/vendor/bundle/jruby/2.6.0/gems/logstash-filter-translate-3.4.0/lib/logstash/filters/translate.rb:184)", "org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:152)", "org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:148)", "org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:210)", "org.jruby.RubyClass.finvoke(RubyClass.java:572)", "org.jruby.runtime.Helpers.invoke(Helpers.java:649)", "org.jruby.RubyBasicObject.callMethod(RubyBasicObject.java:348)", "org.logstash.config.ir.compiler.FilterDelegatorExt.doRegister(FilterDelegatorExt.java:88)", "org.logstash.config.ir.compiler.AbstractFilterDelegatorExt.register(AbstractFilterDelegatorExt.java:75)", 
"usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.RUBY$block$register_plugins$1(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:234)", "org.jruby.runtime.CompiledIRBlockBody.yieldDirect(CompiledIRBlockBody.java:151)", "org.jruby.runtime.BlockBody.yield(BlockBody.java:106)", "org.jruby.runtime.Block.yield(Block.java:188)", "org.jruby.RubyArray.each(RubyArray.java:1865)", "usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.RUBY$method$register_plugins$0(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:233)", "org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:165)", "org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:185)", "org.jruby.ir.targets.indy.InvokeSite.fail(InvokeSite.java:278)", "usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.RUBY$method$maybe_setup_out_plugins$0(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:601)", "usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.RUBY$method$maybe_setup_out_plugins$0$__VARARGS__(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:598)", "org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:139)", "org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:112)", "org.jruby.ir.targets.indy.InvokeSite.fail(InvokeSite.java:248)", "org.jruby.ir.targets.indy.InvokeSite.fail(InvokeSite.java:255)", "usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.RUBY$method$start_workers$0(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:246)", "usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.RUBY$method$start_workers$0$__VARARGS__(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:242)", "org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:139)", 
"org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:112)", "org.jruby.ir.targets.indy.InvokeSite.fail(InvokeSite.java:248)", "org.jruby.ir.targets.indy.InvokeSite.fail(InvokeSite.java:255)", "usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.RUBY$method$run$0(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:191)", "usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.RUBY$method$run$0$__VARARGS__(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:186)", "org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:139)", "org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:112)", "org.jruby.ir.targets.indy.InvokeSite.fail(InvokeSite.java:248)", "org.jruby.ir.targets.indy.InvokeSite.fail(InvokeSite.java:255)", "usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.RUBY$block$start$1(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:143)", "org.jruby.runtime.CompiledIRBlockBody.callDirect(CompiledIRBlockBody.java:141)", "org.jruby.runtime.IRBlockBody.call(IRBlockBody.java:64)", "org.jruby.runtime.IRBlockBody.call(IRBlockBody.java:58)", "org.jruby.runtime.Block.call(Block.java:143)", "org.jruby.RubyProc.call(RubyProc.java:309)", "org.jruby.internal.runtime.RubyRunnable.run(RubyRunnable.java:107)", "java.base/java.lang.Thread.run(Thread.java:833)"], "pipeline.sources"=>["/usr/share/logstash/pipeline/syslog.conf"], :thread=>"#<Thread:0x4a3de69b run>"}
[2023-02-24T14:58:10,950][INFO ][logstash.javapipeline    ][pipeline_translate] Pipeline terminated {"pipeline.id"=>"pipeline_translate"}
[2023-02-24T14:58:10,958][ERROR][logstash.agent           ] Failed to execute action {:id=>:pipeline_translate, :action_type=>LogStash::ConvergeResult::FailedAction, :message=>"Could not execute action: PipelineAction::Create<pipeline_translate>, action_result: false", :backtrace=>nil}
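
The limit in the error above is expressed in code points, not bytes: 3145728 is 3 * 1024 * 1024. For a dictionary with multi-byte UTF-8 characters the two figures differ, so the file size on disk is only an upper bound. The sketch below counts the code points in a file so it can be compared against that figure. The function names and the UTF-8 default are assumptions made for illustration, and SnakeYAML's own accounting (for example around a byte order mark) may differ slightly.

```python
import os
import tempfile

# 3 * 1024 * 1024, the figure quoted in the error message.
SNAKEYAML_LIMIT = 3 * 1024 * 1024


def count_code_points(path, encoding="utf-8"):
    """Return the number of Unicode code points in a text file."""
    total = 0
    # newline="" keeps line endings untranslated, so "\r\n" counts as two.
    with open(path, "r", encoding=encoding, newline="") as handle:
        # Read in chunks so a large dictionary is not held in memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), ""):
            total += len(chunk)  # len() of a str counts code points
    return total


def exceeds_limit(path, limit=SNAKEYAML_LIMIT):
    """True if the file has more code points than the given limit."""
    return count_code_points(path) > limit


if __name__ == "__main__":
    # Demo on a throwaway file; point this at a real dictionary instead.
    with tempfile.NamedTemporaryFile(
        "w", encoding="utf-8", newline="", suffix=".yaml", delete=False
    ) as tmp:
        tmp.write('"10.0.0.1": "host-a"\n')
    try:
        print(count_code_points(tmp.name), exceeds_limit(tmp.name))
    finally:
        os.remove(tmp.name)
```

A 25 MB dictionary like the one described here would exceed the limit under any encoding that uses at most 4 bytes per code point, since even at 4 bytes each it holds more than 6 million code points.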
@SnDsound SnDsound added the bug label Feb 24, 2023
@MikeKemmerer

Thank you for identifying this. We use the translate filter with large yaml files, and this is a dealbreaker for upgrading.

@nicpenning

We are stuck on 8.3.2 due to this as well. Please investigate!

@nicpenning

We migrated this particular pipeline to an Elasticsearch ingest node and used an enrich pipeline instead. We could then upgrade from 8.3.2 to 8.7.0 without any major side effects. Hope this helps others out there!

@roaksoax roaksoax assigned kaisecheng and unassigned andsel May 2, 2023
@kaisecheng
Contributor

kaisecheng commented May 4, 2023

This issue is related to JRuby's Psych, and a workaround landed in JRuby 9.4.1.0.

The fix for this plugin will need to update its Psych usage.

kaisecheng added a commit that referenced this issue May 12, 2023
Added the setting `yaml_dictionary_code_point_limit` to configure the maximum number of code points allowed in the YAML file at `dictionary_path`, overcoming the 3MB limit from SnakeYAML 1.33. The setting applies only to YAML. Loading a YAML file over the limit throws an exception. JSON and CSV currently have no such restriction. The default value of `yaml_dictionary_code_point_limit` is 128MB.

Fixed: #96
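
As a rough sketch of how the new setting might be used, the filter from the original report could raise the limit explicitly. This assumes a plugin version that includes the commit above, and that the setting takes a plain number of code points. The value shown (256 * 1024 * 1024) is an arbitrary illustration. With the stated 128MB default, a 25 MB dictionary should load without setting it at all.

```
translate {
    source => "something.ip"
    target => "something.name"
    dictionary_path => '/usr/share/logstash/files/largefile.yaml'
    # Assumed value for illustration: 256 * 1024 * 1024 code points.
    # Only affects YAML dictionaries; JSON and CSV are not limited.
    yaml_dictionary_code_point_limit => 268435456
}
```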