org.yaml.snakeyaml.error.YAMLException: The incoming YAML document exceeds the limit: 3145728 code points. #7543

Closed
donv opened this issue Dec 28, 2022 · 12 comments

@donv
Member

donv commented Dec 28, 2022

Environment Information

jruby 9.3.9.0 (2.6.8) 2022-10-24 537cd1f OpenJDK 64-Bit Server VM 17.0.5+8-LTS on 17.0.5+8-LTS +jit [x86_64-darwin]
Darwin 21.6.0 Darwin Kernel Version 21.6.0: Sun Nov 6 23:31:16 PST 2022; root:xnu-8020.240.14~1/RELEASE_X86_64 x86_64

Expected Behavior

Using Psych::Parser to parse a large yaml file succeeds.

Actual Behavior

With JRuby 9.3.9.0, parsing a large YAML file raises an exception:

 org.yaml.snakeyaml.error.YAMLException: The incoming YAML document exceeds the limit: 3145728 code points.

Reverting to jruby-9.3.8.0 works.
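
A minimal reproduction sketch, assuming the only trigger is document size: it builds a flat mapping just over the 3145728 (3 × 1024 × 1024) code point limit quoted in the exception and feeds it to `Psych::Parser`. The helper name `oversized_yaml` and the key/value shape are made up for illustration. On a runtime without the limit this should simply parse; on JRuby 9.3.9.0 it would be expected to hit the exception above.

```ruby
require "psych"

# Build a flat YAML mapping slightly longer than `limit` code points.
# Hypothetical helper; the content itself is arbitrary ASCII filler.
def oversized_yaml(limit)
  template = "key_%07d: #{'x' * 80}\n"
  line_length = (template % 0).length
  count = (limit / line_length) + 1
  Array.new(count) { |i| template % i }.join
end

# The limit reported in the exception message.
default_limit = 3 * 1024 * 1024
yaml = oversized_yaml(default_limit)
puts "document length: #{yaml.length} code points"

begin
  Psych::Parser.new.parse(yaml)
  puts "parsed without error"
rescue Exception => e
  # Rescued broadly on the assumption that JRuby may surface the
  # SnakeYAML error as a Java exception rather than a StandardError.
  puts "#{e.class}: #{e.message}"
end
```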

@chadlwilson
Contributor

This is due to #7388 and the 3MB (3,145,728) code point limit now honored in SnakeYAML 1.32 ( https://bitbucket.org/snakeyaml/snakeyaml/wiki/Changes )

Without looking at the specifics, I suspect this might require ruby/psych#579 to work around it by setting
https://javadoc.io/static/org.yaml/snakeyaml/1.32/org/yaml/snakeyaml/LoaderOptions.html#setCodePointLimit-int-

@donv
Member Author

donv commented Jan 9, 2023

Yeah, this looks right.

Any way to set the limit before ruby/psych#579 is done?

@chadlwilson
Contributor

Apart from some hardcore runtime bytecode manipulation, I don't think so. Sadly, the LoaderOptions defaults appear to be hard-coded, with no external way to change the default values JVM-wide (e.g. static default holders, system properties).

@headius
Member

headius commented Jan 13, 2023

I have added some of these methods in ruby/psych#613. This is a unilateral exposure of these properties only in the JRuby version, so we should try to work with the maintainers of the C extension and see if we can have the same API for both.

@headius
Member

headius commented Jan 26, 2023

I've requested review for my changes in ruby/psych#613. I would also like to release psych 5.1 to incorporate the new SnakeYAML Engine in ruby/psych#612, so this is a good time to do it.

@enebo enebo added this to the JRuby 9.4.2.0 milestone Jan 31, 2023
opti added a commit to ua-parser/uap-ruby that referenced this issue Jan 31, 2023
* bump-gems-2023-01-31

* Sync with uap-core

* Simplify tests matrix

* Limit simplecov to a single test run

* Lock jruby version for tests

Ref jruby/jruby#7543

* Bump actions/checkout
@chadlwilson
Contributor

chadlwilson commented Feb 8, 2023

Looks like the ability to control this was merged in ruby/psych#613 and released in Psych 5.1.0, which is part of JRuby 9.4.1.0 via #7626. If that is all that is required on the JRuby side, could we update the milestone here (currently 9.4.2.0) and close this (unless it needs a backport)?

Also not sure if it needs some docs somewhere to show how to override them in normal use cases.

@headius
Member

headius commented Feb 8, 2023

Oops, yup, this one should have been resolved as of 9.4.1.

@headius headius modified the milestones: JRuby 9.4.2.0, JRuby 9.4.1.0 Feb 8, 2023
@headius headius closed this as completed Feb 8, 2023
@headius
Member

headius commented Feb 8, 2023

There are no docs for the new features and no tests. Perhaps you could come up with some? I must admit I do not know exactly what YAML constructs the various settings apply to.

@chadlwilson
Contributor

The irony is that my/our use case of JRuby actually doesn't rely on YAML parsing, Psych or SnakeYAML at all - it's just that we use jruby-complete and I like reducing noise for the community from CVEs. Some other things I work on have direct SnakeYAML exposure so was familiar with some of the noise/risks in the area and interested in the overlap with JRuby world.

I wonder if @donv has something on the test side.

Where do you suggest the docs would sit? Within Psych? JRuby itself somewhere?

@donv
Member Author

donv commented Feb 8, 2023

Hi!

Documenting the usage here since I could not find it anywhere else yet.

@parser = Psych::Parser.new
@parser.code_point_limit = 20_000_000

I have switched to JRuby 9.4.1.0 in development and it seems to work fine.
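
For anyone who needs the parsed data back and not just a parse pass, here is a slightly fuller sketch of the same idea. It assumes the `code_point_limit=` setter only exists in the JRuby build of Psych 5.1.0+, so it is guarded with `respond_to?` and becomes a no-op elsewhere. `load_with_limit` is a hypothetical helper name, not part of Psych, and the 20_000_000 default is just the value used above.

```ruby
require "psych"

# Parse a YAML string into Ruby objects, raising the code point limit
# when the running Psych exposes the setter.
# Hypothetical helper for illustration; not a Psych API.
def load_with_limit(yaml, limit: 20_000_000)
  handler = Psych::TreeBuilder.new
  parser = Psych::Parser.new(handler)
  # Assumption: only the JRuby extension defines this setter.
  parser.code_point_limit = limit if parser.respond_to?(:code_point_limit=)
  parser.parse(yaml)
  # The tree root is a stream node; convert its first document, if any.
  document = handler.root.children.first
  document&.to_ruby
end

config = load_with_limit("name: example\nsize: 3\n")
puts config.inspect
```

As far as I can tell, converting the node tree with `to_ruby` does not apply `safe_load`-style class restrictions, so this is only suitable for trusted input.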

@headius
Member

headius commented Feb 8, 2023

@donv thanks for confirming!

@headius
Member

headius commented Feb 8, 2023

@chadlwilson Well, that is a good question. There is no logic in rdoc to generate documentation from the Java extension, and we don't have the same feature in the C extension, so putting it in the general psych docs would not work anyway.

@hsbt @tenderlove How should we handle this? Maybe we can collaborate to get these same config options supported in the C extension?


4 participants