-
Notifications
You must be signed in to change notification settings - Fork 47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix the limitation of the size of yaml file that exceeds 3MB #97
Fix the limitation of the size of yaml file that exceeds 3MB #97
Conversation
… size of the yaml file in `dictionary_path` to overcome the 3MB size limit from SnakeYaml 1.33 Fixed: logstash-plugins#96
dictionary_file_max_bytes
to config the maximum bytes…
docs/index.asciidoc
Outdated
===== `dictionary_file_max_bytes` | ||
|
||
* Value type is <<number,number>> | ||
* Default value is 3145728 (3MB) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that there didn't use to be a limit, and I don't think disabling limit checking it is an option (or is it?), I'm tempted to put a very high number here, like 128MB or even 1GB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no option to disable the limit check. I put it 128MB
Changes overall LGTM by agreeing @jsvd 's comments. Reference thread: link |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few leftovers from the max_bytes setting.
Also, I tried locally with refresh_interval
and increasing the dictionary over the code point limit, and the reloader thread crashed:
[2023-05-11T23:26:31,445][ERROR][logstash.filters.translate][main] Scheduler intercepted an error: {:exception=>NoMethodError, :message=>"private method `warn' called for nil:NilClass", :backtrace=>["/Users/joaoduarte/elastic/logstash-plugins/logstash-filter-translate/lib/logstash/filters/dictionary/file.rb:123:in `loading_exception'", "/Users/joaoduarte/elastic/logstash-plugins/logstash-filter-translate/lib/logstash/filters/dictionary/file.rb:60:in `load_dictionary'",
The two @logger.warn
in file.rb need to be changed to just logger.warn
to produce the right error message:
[2023-05-11T23:22:42,793][WARN ][logstash.filters.dictionary.yamlfile][main] Translate: The incoming YAML document exceeds the limit: 100 code points. when loading dictionary file at /tmp/dict.yml, continuing with old dictionary {:dictionary_path=>"/tmp/dict.yml"}
Co-authored-by: João Duarte <[email protected]>
Co-authored-by: João Duarte <[email protected]>
Co-authored-by: João Duarte <[email protected]>
Co-authored-by: João Duarte <[email protected]>
Co-authored-by: João Duarte <[email protected]>
Co-authored-by: João Duarte <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a suggestion to avoid leaking the yaml code point limit setting to the base dictionary file initialize
and initialize_for_file_type
method signatures, by using an optional argument hash.
Co-authored-by: João Duarte <[email protected]>
Co-authored-by: João Duarte <[email protected]>
Co-authored-by: João Duarte <[email protected]>
Co-authored-by: João Duarte <[email protected]>
Co-authored-by: João Duarte <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Fixed: #96
added setting
yaml_dictionary_code_point_limit
to config the maximum code point limit of the yaml file indictionary_path
to overcome the 3MB size limit from SnakeYaml 1.33. This setting is only effective for yaml. Yaml file over the size limit throws an exception. JSON and CSV currently do not have such restriction. The default value ofyaml_dictionary_code_point_limit
is 128MB.How to test