Add local entity expansion limit to REXML::Parsers::StreamParser #192

Closed
otegami opened this issue Aug 4, 2024 · 1 comment · Fixed by #202

Comments

otegami commented Aug 4, 2024

Currently, REXML only allows changing the entity expansion text limit globally, via REXML::Security.entity_expansion_text_limit. Because the setting is global, raising it affects every parsing operation in the application, including parts of the system where a lower limit is preferable for security.
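To make the side effect concrete, here is a minimal sketch. The method name `limit_seen_by_unrelated_code` is hypothetical and only stands in for any other component in the same process; the value 163_840 is just an example.

```ruby
require "rexml/document"

# Hypothetical stand-in for unrelated code elsewhere in the application
# that relies on REXML's entity expansion text limit.
def limit_seen_by_unrelated_code
  REXML::Security.entity_expansion_text_limit
end

default_limit = REXML::Security.entity_expansion_text_limit

# Raising the limit for one large dataset...
REXML::Security.entity_expansion_text_limit = 163_840

# ...also raises it for everything else, because the value is stored on the module.
puts limit_seen_by_unrelated_code

# Put the previous value back so the rest of the process is unaffected.
REXML::Security.entity_expansion_text_limit = default_limit
```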

Real-world Use Case

While processing a large XML dataset of Wikipedia articles, we needed to temporarily increase the entity expansion text limit for specific parsing operations involving large data elements. Because the setting is global, we had to raise the limit for the whole process, which was not ideal.

Ref: red-data-tools/red-datasets#198
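For context, the kind of workaround this forces looks roughly like the sketch below. `with_entity_expansion_text_limit` and `TextCollector` are hypothetical names made up for illustration and are not part of REXML. Even with the `ensure`, the raised limit is visible to every thread while the block runs, which is the problem described above.

```ruby
require "rexml/document"
require "rexml/parsers/streamparser"
require "rexml/streamlistener"

# Hypothetical helper: temporarily raise the global limit, then restore it
# even if parsing raises.
def with_entity_expansion_text_limit(limit)
  previous = REXML::Security.entity_expansion_text_limit
  REXML::Security.entity_expansion_text_limit = limit
  begin
    yield
  ensure
    REXML::Security.entity_expansion_text_limit = previous
  end
end

# Hypothetical listener that collects text nodes.
class TextCollector
  include REXML::StreamListener

  attr_reader :texts

  def initialize
    @texts = []
  end

  def text(content)
    @texts << content
  end
end

listener = TextCollector.new
with_entity_expansion_text_limit(163_840) do
  REXML::Parsers::StreamParser.new("<page>Ruby</page>", listener).parse
end
```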

Proposal

I propose adding an instance-level setter for the entity expansion text limit to REXML::Parsers::StreamParser. It would let developers adjust the limit for an individual parser instance without touching the global configuration.

parser = REXML::Parsers::StreamParser.new(entry.read, listener)
parser.entity_expansion_text_limit = 163_840
parser.parse

Adding this feature would provide the following benefits.

  • Enhanced Security: Allows localized adjustments of the limit, preventing broad changes that could lower the security posture of applications.
  • Increased Flexibility: Developers can tailor parser behavior to specific needs without affecting other parts of the application.
  • Reduced Risk of Side Effects: Confines changes to the intended scope, reducing the likelihood of unexpected behavior in unrelated parts of an application.
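A slightly fuller sketch of how calling code might use the proposed setter. This assumes the setter is named `entity_expansion_text_limit=` as proposed above; since not every rexml version has it, the sketch checks with `respond_to?` first. `parse_with_local_limit` and `CountingListener` are hypothetical names for illustration only.

```ruby
require "rexml/document"
require "rexml/parsers/streamparser"
require "rexml/streamlistener"

# Hypothetical listener that counts the bytes of text it receives.
class CountingListener
  include REXML::StreamListener

  attr_reader :text_bytes

  def initialize
    @text_bytes = 0
  end

  def text(content)
    @text_bytes += content.bytesize
  end
end

# Hypothetical helper: parse with a per-parser limit when the installed rexml
# supports it. Returns true if the local limit was applied, false otherwise.
def parse_with_local_limit(xml, listener, limit)
  parser = REXML::Parsers::StreamParser.new(xml, listener)
  applied = parser.respond_to?(:entity_expansion_text_limit=)
  parser.entity_expansion_text_limit = limit if applied
  parser.parse
  applied
end

listener = CountingListener.new
parse_with_local_limit("<page>Ruby</page>", listener, 163_840)
```

Either way, REXML::Security.entity_expansion_text_limit is never written, so other parsers in the process keep the default.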

naitoh commented Aug 17, 2024

@otegami
Thanks for this proposal.

Ref: red-data-tools/red-datasets#198

By the way, I think the above case was caused by #195, which I believe was resolved in rexml 3.3.5.

naitoh added 5 commits to naitoh/rexml that referenced this issue Aug 22, 2024
naitoh added 2 commits to naitoh/rexml that referenced this issue Aug 23, 2024
naitoh added 2 commits to naitoh/rexml that referenced this issue Aug 26, 2024
## Why?
See: ruby#192

---------

Co-authored-by: Sutou Kouhei <[email protected]>
kou closed this as completed in #202 Aug 26, 2024
kou closed this as completed in caec187 Aug 26, 2024