Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(cspell): sort and unique ignore lists #36009

Merged
merged 2 commits into from
Sep 29, 2024

Conversation

OnkarRuikar
Copy link
Contributor

The PR adds pre-commit hooks to sort and uniq the cSpell ignore list files.

@OnkarRuikar OnkarRuikar requested review from a team as code owners September 23, 2024 09:48
@OnkarRuikar OnkarRuikar requested review from pepelsbey and removed request for a team September 23, 2024 09:48
@github-actions github-actions bot added system [PR only] Infrastructure and configuration for the project size/m [PR only] 51-500 LoC changed merge conflicts 🚧 [PR only] labels Sep 23, 2024
Copy link
Contributor

This pull request has merge conflicts that must be resolved before it can be merged.

Copy link
Member

@Josh-Cena Josh-Cena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make this a CI check too? To avoid people bypassing it or committing from web interface.

@github-actions github-actions bot added size/l [PR only] 501-1000 LoC changed and removed merge conflicts 🚧 [PR only] size/m [PR only] 51-500 LoC changed labels Sep 24, 2024
@OnkarRuikar
Copy link
Contributor Author

To avoid people bypassing it or committing from web interface.

Let's not prevent people from contributing. Like other stuff, we can always auto-fix it at midnight.

@Josh-Cena
Copy link
Member

Josh-Cena commented Sep 24, 2024

Most people do not change cspell files. Those that do (usually to address the weekly spellcheck errors) should ensure that they are changed in a correct manner.

Our Prettier review bot is the best contribution model. No bad content goes into the master branch and there's minimal friction to both contributors and reviewers. I think it should be implemented for everything, although Markdownlint is slightly different because the effect is non-local: a redirect can cause fixes in many files that you didn't touch. I would still advocate for local fixable changes (i.e. errors caused by changed lines) to be reported by the bot, though. And surely people change the dictionary file about once a week, so we should do everything properly in one commit.

@OnkarRuikar
Copy link
Contributor Author

OnkarRuikar commented Sep 24, 2024

Most people do not change cspell files.

We mark the issues good first issues. So we should expect new contributors fixing typos.

How do you propose to handle this?

  1. Sugeest sorted changes as reviewdog comments.
  2. Create a new cspell list check workflow and fail it if files are not sorted properly and in logs ask the contributor to run the sort commands locally and push.

@Josh-Cena
Copy link
Member

Josh-Cena commented Sep 24, 2024

Unfortunately I would expect them to fix obvious typos but not, say, add things to dictionaries. It's too much of a judgment call to tell what to do with each word and I'm afraid, putting myself into the shoes of a new contributor, they would not have the courage to do. For example, who knows if I should add greetme to the dictionary or if I should change it to greetMe? Who knows if this abbreviation is actually a standard and/or intentional? Again I think we are too optimistic about how much freedom new contributors are willing to exercise—when I'm triaging issues, I do not put up "good first issue" unless new contributors know exactly which line to edit and what to change it to, without using any of their own discretion.

Ideally it would be reviewdog. But the changes are too non-local and result in proposed diff located in uncommentable areas. So, the workflow could output a whole dictionary file and ask them to copy&paste, or it could output a diff that we can apply (since ideally it would only be a few lines added each time anyway). What I don't want is two PRs every week, one called "fix typos" another called "fix dictionary file", both touching the same file. Do it once and do it right.

@OnkarRuikar OnkarRuikar marked this pull request as draft September 24, 2024 02:07
@OnkarRuikar
Copy link
Contributor Author

Agree. Fixing typos in code is not good first worthy. I'll remove the good first label.

It's not possible to comment on unedited lines. I'll create a new check workflow that will run when the list files are edited.

Copy link
Contributor

This pull request has merge conflicts that must be resolved before it can be merged.

@github-actions github-actions bot added merge conflicts 🚧 [PR only] Content:CSS Cascading Style Sheets docs Content:HTML Hypertext Markup Language docs Content:WebExt WebExtensions docs Content:WebAPI Web API docs Content:JS JavaScript docs Content:Learn Learning area docs Content:HTTP HTTP docs Content:SVG SVG docs and removed merge conflicts 🚧 [PR only] labels Sep 28, 2024
@github-actions github-actions bot added Content:MathML MathML docs Content:Glossary Glossary entries Content:Firefox Content in the Mozilla/Firefox subtree Content:Learn:HTML Learning area HTML docs Content:Manifest labels Sep 28, 2024
@github-actions github-actions bot removed Content:CSS Cascading Style Sheets docs Content:HTML Hypertext Markup Language docs Content:WebExt WebExtensions docs Content:WebAPI Web API docs Content:JS JavaScript docs Content:Learn Learning area docs Content:HTTP HTTP docs Content:SVG SVG docs Content:MathML MathML docs Content:Glossary Glossary entries Content:Firefox Content in the Mozilla/Firefox subtree Content:Learn:HTML Learning area HTML docs Content:Manifest labels Sep 28, 2024
@OnkarRuikar OnkarRuikar marked this pull request as ready for review September 28, 2024 06:49
Comment on lines 22 to 24
a = a.toUpperCase();
b = b.toUpperCase();
return a < b ? -1 : a > b ? 1 : 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use Intl.Collator with sensitivity: "accent"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The collator sounds good but cSpell doesn't seem use collator for comparisons. So after formatting the files using collator, I am getting the following errors from cspell because now the code treets 'acosh' = '𝚊𝚌𝚘𝚜𝚑'.

files/en-us/web/javascript/reference/global_objects/intl/displaynames/of/index.md:28:153 - Unknown word (nonterminal)
files/en-us/web/javascript/reference/global_objects/intl/displaynames/of/index.md:68:48 - Unknown word (canadien)
files/en-us/web/javascript/reference/global_objects/math/acosh/index.md:14:300 - Unknown word (𝚊𝚌𝚘𝚜𝚑)
files/en-us/web/javascript/reference/global_objects/math/asinh/index.md:14:204 - Unknown word (𝚊𝚜𝚒𝚗𝚑)
files/en-us/web/javascript/reference/global_objects/math/atanh/index.md:14:397 - Unknown word (𝚊𝚝𝚊𝚗𝚑)
files/en-us/web/javascript/reference/global_objects/math/cbrt/index.md:14:83 - Unknown word (𝚌𝚋𝚛𝚝)
files/en-us/web/javascript/reference/global_objects/math/expm1/index.md:14:83 - Unknown word (𝚎𝚡𝚙𝚖)
files/en-us/web/javascript/reference/global_objects/math/hypot/index.md:14:109 - Unknown word (𝚑𝚢𝚙𝚘𝚝)
files/en-us/web/javascript/reference/global_objects/math/log/index.md:14:182 - Unknown word (𝚕𝚘𝚐)
files/en-us/web/javascript/reference/global_objects/math/log10/index.md:14:182 - Unknown word (𝚕𝚘𝚐)
files/en-us/web/javascript/reference/global_objects/math/log1p/index.md:14:192 - Unknown word (𝚕𝚘𝚐)
files/en-us/web/javascript/reference/global_objects/math/log2/index.md:14:182 - Unknown word (𝚕𝚘𝚐)
files/en-us/web/javascript/reference/global_objects/math/sqrt/index.md:14:179 - Unknown word (𝚜𝚚𝚛𝚝)

The MathML pages use non-ASCII characters(a-zA-Z), that look like ASCII, in the formulas. I don't know the reason for using non ASCII abcd characters in the MathML code. Need to ask @fred-wang. Should we change those to make ASCII?

Copy link
Member

@Josh-Cena Josh-Cena Sep 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The MathML formulae often use the monospace Latin characters. Including JS pages I wrote, because I converted them using TeXZilla, so yes I'd be curious to hear what Frédéric says, including whether it's possible to make TeXZilla use inline CSS instead.

I'm surprised that 𝚌𝚋𝚛𝚝 and cbrt are considered equal though. Maybe we should keep using toLowerCase then—just change the regex to use a.toLowerCase() === b.toLowerCase().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I've used toLowerCase for all the comparisons.

scripts/sort_and_unique_file_lines.js Outdated Show resolved Hide resolved
Copy link
Member

@Josh-Cena Josh-Cena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@Josh-Cena Josh-Cena merged commit e967e5a into mdn:main Sep 29, 2024
9 of 10 checks passed
@OnkarRuikar OnkarRuikar deleted the ci_sort_ignore_lists branch September 29, 2024 05:03
@OnkarRuikar
Copy link
Contributor Author

PR #36109 tested this PR check workflow.

  • I created the Fix typos #36109 PR using the GitHub web interface.
  • New words were added at the beginning of the ignore-list.txt.
  • The workflow failed with the following error:

    The file is not formatted properly. Run 'node scripts/sort_and_unique_file_lines.js .vscode/ignore-list.txt' to format the file.
    Error: Process completed with exit code 1.

  • Then I ran git fetch origin patch-3:patch-3 on my PC to get the PR branch.
  • Then ran the suggested command to format the list file: node scripts/sort_and_unique_file_lines.js .vscode/ignore-list.txt
  • Then pushed the changes using git push -u origin patch-3.

fiji-flo pushed a commit that referenced this pull request Oct 2, 2024
* fix(cspell): sort and uniq ignore lists

* use toLowerCase for comparisons
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size/l [PR only] 501-1000 LoC changed system [PR only] Infrastructure and configuration for the project
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants