Unexpected Charset Possible for Currency Symbol #4279

Merged: 8 commits merged on Dec 23, 2024

Conversation

oleibman
Collaborator

This is:

  • a bugfix
  • a new feature
  • refactoring
  • additional unit tests

Checklist:

  • Changes are covered by unit tests
    • Changes are covered by existing unit tests
    • New unit tests have been added
  • Code style is respected
  • Commit message explains why the change is made (see https://github.com/erlang/otp/wiki/Writing-good-commit-messages)
  • CHANGELOG.md contains a short summary of the change and a link to the pull request if applicable
  • Documentation is updated as necessary

Why this change is needed?

Provide an explanation of why this change is needed, with links to any Issues (if appropriate).
If this is a bugfix or a new feature, and there are no existing Issues, then please also create an issue that will make it easier to track progress with this PR.

We do not recommend it, but users can call the PHP function `setlocale`, and that can change the character used as the currency symbol. A problem arises when the caller of `setlocale` does not specify a character set: in that case, PHP will attempt to return its `localeconv()` values in a single-byte character set rather than UTF-8. This is particularly problematic for currency symbols. Until now, PhpSpreadsheet has accepted such a character, which can lead to corrupt spreadsheets. It now validates the currency symbol as UTF-8 and falls back to a different choice if validation fails (e.g. EUR rather than 0x80, which is how the euro symbol is encoded in Win-1252).
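
A minimal sketch of the validate-then-fall-back idea described above. The helper name `pickCurrencySymbol` and its parameters are hypothetical and are not PhpSpreadsheet's actual code; it assumes the `mbstring` extension is available.

```php
<?php
// Hypothetical helper for illustration only; not the library's implementation.
function pickCurrencySymbol(string $symbol, string $fallback): string
{
    // mb_check_encoding() returns false when the bytes are not valid UTF-8,
    // which is the case for a lone 0x80 byte (the euro sign in Win-1252).
    if ($symbol !== '' && mb_check_encoding($symbol, 'UTF-8')) {
        return $symbol;
    }

    return $fallback;
}

$win1252Euro = "\x80";       // euro sign as a single Win-1252 byte
$utf8Euro = "\xE2\x82\xAC";  // euro sign encoded as UTF-8 (three bytes)

var_dump(pickCurrencySymbol($win1252Euro, 'EUR')); // string(3) "EUR"
var_dump(pickCurrencySymbol($utf8Euro, 'EUR'));    // string(3) "€"
```

The single-byte value is rejected and replaced by the fallback, while the valid UTF-8 symbol passes through unchanged.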

An additional problem arises because Linux systems seem to return the alternate symbol with a trailing blank, but Windows systems do not. To give callers a consistent result, a parameter is added to `getCurrencyCode` that controls whether the currency code is trimmed; by default it is not trimmed.
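
To illustrate why an opt-in trim gives a consistent result across platforms, here is a small sketch. `normalizeCurrencyCode` and its `$trim` parameter are hypothetical stand-ins; the real parameter on `getCurrencyCode` may be named and implemented differently.

```php
<?php
// Hypothetical stand-in for illustration; not the actual getCurrencyCode signature.
function normalizeCurrencyCode(string $code, bool $trim = false): string
{
    // Default (false) leaves the value untouched, matching the PR's stated default.
    return $trim ? rtrim($code) : $code;
}

$linuxStyle = 'EUR ';   // alternate symbol with a trailing blank
$windowsStyle = 'EUR';  // alternate symbol without one

// Without trimming, the two platforms disagree.
var_dump(normalizeCurrencyCode($linuxStyle) === normalizeCurrencyCode($windowsStyle)); // bool(false)

// With trimming requested, callers see the same value on both.
var_dump(normalizeCurrencyCode($linuxStyle, true) === normalizeCurrencyCode($windowsStyle, true)); // bool(true)
```

Keeping the untrimmed behavior as the default means existing callers see no change unless they ask for it.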
@oleibman
Collaborator Author

False positive from Scrutinizer, marked to ignore.

@oleibman oleibman added this pull request to the merge queue Dec 23, 2024
Merged via the queue into PHPOffice:master with commit 687d87c Dec 23, 2024
12 of 14 checks passed
@oleibman oleibman deleted the currsymbol branch December 23, 2024 17:13