[\W\D] fails to match alphabetic characters #241

Open
muellerj2 opened this issue Jan 20, 2025 · 0 comments

The (ECMAScript) regular expression [\W\D] describes a character class that matches the union of (a) all non-alphanumeric characters (\W) and (b) all non-digits (\D). Every digit is alphanumeric, so \W is a subset of \D, and the class should be equivalent to [\D] and match all non-digits. However, Boost.Regex actually matches only non-alphanumeric characters.

Test case:

#include <iostream>
#include <boost/regex.hpp>

using namespace boost;

int main()
{
    // Union of two negated classes: non-word (\W) or non-digit (\D).
    regex re(R"([\W\D])");
    // Each regex_match result is printed as 0 (no match) or 1 (match).
    std::cout << "matches alphabetic: " << regex_match("a", re) << '\n'
         << "matches digit: " << regex_match("0", re) << '\n'
         << "matches non-alphanumeric: " << regex_match(".", re);

    return 0;
}

https://godbolt.org/z/jPf79j5nr

This prints:

matches alphabetic: 0
matches digit: 0
matches non-alphanumeric: 1

But it should print:

matches alphabetic: 1
matches digit: 0
matches non-alphanumeric: 1

I think the problem lies here:

void add_negated_class(m_type m)
{
   m_negated_classes |= m;
   m_empty = false;
}

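The masks of the negated character classes are bitwise OR'ed, so a character is accepted only if it belongs to none of the OR'ed classes. That is not (w or d) = (not w) and (not d), the intersection of the negated classes rather than their union. De Morgan's law says that (not w) or (not d) = not (w and d), so the bit masks should really be bitwise AND'ed.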
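The difference between the two combinations can be sketched outside of Boost.Regex. Everything in the following snippet is hypothetical and for illustration only: the mask values, the `isctype` stand-in, and the three helper functions are assumptions, not the library's real traits or matcher code. It assumes a mask layout in which the \w mask contains the digit bit, so that AND'ing masks happens to correspond to intersecting classes.

```cpp
#include <cctype>

// Hypothetical class masks for illustration only; these are not the values
// used by Boost.Regex or by any particular traits class.
enum : unsigned {
    mask_digit      = 1u << 0,
    mask_alpha      = 1u << 1,
    mask_underscore = 1u << 2,
    mask_word       = mask_digit | mask_alpha | mask_underscore  // \w
};

// Hypothetical stand-in for a traits isctype(): true if c belongs to at
// least one of the classes selected in m.
bool isctype(char c, unsigned m)
{
    const unsigned char uc = static_cast<unsigned char>(c);
    unsigned have = 0;
    if (std::isdigit(uc)) have |= mask_digit;
    if (std::isalpha(uc)) have |= mask_alpha;
    if (c == '_')         have |= mask_underscore;
    return (have & m) != 0;
}

// Masks OR'ed, then one negated test: (not w) and (not d).
bool matches_with_or(char c)  { return !isctype(c, mask_word | mask_digit); }

// Masks AND'ed, then one negated test: not (w and d).
bool matches_with_and(char c) { return !isctype(c, mask_word & mask_digit); }

// Reference semantics of [\W\D]: test each negated class on its own.
bool matches_union(char c)
{
    return !isctype(c, mask_word) || !isctype(c, mask_digit);
}
```

Under these assumptions, `matches_with_or('a')` is false, which mirrors the output reported above, while `matches_with_and` agrees with `matches_union` for 'a', '0' and '.'.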

But bitwise and'ing would be problematic as well, because no requirement is placed on traits classes that and'ing the character class bit masks corresponds to the intersection of the character classes. I guess and'ing will probably still work for traits classes provided by Boost.Regex (although I haven't checked that), but it's not guaranteed to do the right thing for user-provided traits classes.
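To illustrate the concern with user-provided traits, here is a minimal sketch in which \w has its own bit instead of containing the digit bit. All names (`user_isctype`, `negated_class_list`, the mask values) are hypothetical. With this layout the AND of the two masks is empty, so the negated test accepts every character, digits included. One mask-independent alternative, shown only as a possible direction and not as a proposed patch, is to keep the negated classes separate and test each one individually.

```cpp
#include <cctype>
#include <vector>

// Hypothetical user-style traits masks: \w has its own bit and does not
// share the digit bit, so mask AND is not class intersection.
enum : unsigned { user_mask_digit = 1u << 0, user_mask_word = 1u << 1 };

// Hypothetical stand-in for a user-provided traits isctype().
bool user_isctype(char c, unsigned m)
{
    const unsigned char uc = static_cast<unsigned char>(c);
    if ((m & user_mask_digit) && std::isdigit(uc)) return true;
    if ((m & user_mask_word) && (std::isalnum(uc) || c == '_')) return true;
    return false;
}

// AND'ing the masks gives 0 here, so this accepts every character.
bool user_matches_with_and(char c)
{
    return !user_isctype(c, user_mask_word & user_mask_digit);
}

// Mask-independent sketch: remember each negated class separately and
// accept a character if it is outside at least one of them.
struct negated_class_list
{
    std::vector<unsigned> masks;

    void add(unsigned m) { masks.push_back(m); }

    bool matches(char c) const
    {
        for (unsigned m : masks)
            if (!user_isctype(c, m)) return true;
        return false;
    }
};
```

The trade-off in this sketch is one `isctype` call per negated class instead of a single call on a combined mask.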
