Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OxfordComma missing support for multiword items #30

Open
tcmetzger opened this issue Aug 13, 2020 · 3 comments
Open

OxfordComma missing support for multiword items #30

tcmetzger opened this issue Aug 13, 2020 · 3 comments
Labels
question Further information is requested

Comments

@tcmetzger
Copy link
Contributor

The current regex for detecting missing Oxford Commas seems to only detect sentences where the last item in the list before the "and"/"or" consists of one single word.

For example:
"Save your file to a hard drive, an external drive or OneDrive." Vale seems to not detect the missing comma after 'drive' (example from https://docs.microsoft.com/en-us/style-guide/punctuation/commas). However, if the sentence is "Save your file to a hard drive, OneDrive or an external drive", Vale will correctly detect the missing comma.

@tcmetzger
Copy link
Contributor Author

tcmetzger commented Aug 13, 2020

I would suggest adding whitespaces (\s) to the regex: '(?:[^,]+,){1,}\s[\w\s]+\s(and|or)\s'. Does that make sense? If so, I'd be happy to open a pull request with this ammendment!

in: '(?:[^,]+,){1,}\s[\w\s]+\s(and|or)\s'
out: '(?:[^,]+,){1,}\s\w+\s(and|or)'
(This would already include changes from #29.)

@jdkato
Copy link
Member

jdkato commented Aug 14, 2020

While it's true that we currently only check for a specific case, the problem with trying to be more generic is that you'll get a lot of false positives.

I ran your suggested pattern on this file and saw these (among other) results:

This explicitness on your part, which is up to you to maintain with discipline, will save you lots of refactor headaches and footguns down the line.

Remember: one of the most important roles for source code is to communicate clearly, not only to you, but your future self and other code collaborators, what your intent is.

But again, other than our intuitions and sensibilities, there doesn't appear to be objective and clear measure of what constitutes "accidents" or prevention thereof.

This isn't to say that the rule can't be improved, but I think adding \s is too generic.

@jdkato jdkato added the question Further information is requested label Aug 14, 2020
@tcmetzger
Copy link
Contributor Author

I see your point, this regex could certainly use some more work. Are there any regex-experts on this project? I'll also keep tweaking and testing some more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants