builtin: fix `assert '_ISspace'.camel_to_snake() == '_i_sspace'` #21736

ttytm · 2024-06-26T04:57:18Z

spytheman · 2024-06-26T05:18:59Z

vlib/builtin/string_test.v

@@ -1529,6 +1529,7 @@ fn test_camel_to_snake() {
 	assert 'BBaa'.camel_to_snake() == 'b_baa'
 	assert 'aa_BB'.camel_to_snake() == 'aa_bb'
 	assert 'JVM_PUBLIC_ACC'.camel_to_snake() == 'jvm_public_acc'
+	assert '_ISspace'.camel_to_snake() == '_i_sspace'


Isn't `IS` a separate word from `space`?
But then BBaa became b_baa 🤔, so it is consistent.

I have these in my /usr/include/ctype.h:

# include <bits/endian.h>
# if __BYTE_ORDER == __BIG_ENDIAN
#  define _ISbit(bit) (1 << (bit))
# else /* __BYTE_ORDER == __LITTLE_ENDIAN */
#  define _ISbit(bit) ((bit) < 8 ? ((1 << (bit)) << 8) : ((1 << (bit)) >> 8))
# endif

enum
{
  _ISupper = _ISbit (0),   /* UPPERCASE. */
  _ISlower = _ISbit (1),   /* lowercase. */
  _ISalpha = _ISbit (2),   /* Alphabetic. */
  _ISdigit = _ISbit (3),   /* Numeric. */
  _ISxdigit = _ISbit (4),  /* Hexadecimal numeric. */
  _ISspace = _ISbit (5),   /* Whitespace. */
  _ISprint = _ISbit (6),   /* Printing. */
  _ISgraph = _ISbit (7),   /* Graphical. */
  _ISblank = _ISbit (8),   /* Blank (usually SPC and TAB). */
  _IScntrl = _ISbit (9),   /* Control character. */
  _ISpunct = _ISbit (10),  /* Punctuation. */
  _ISalnum = _ISbit (11)   /* Alphanumeric. */
};
#endif /* ! _ISbit */

And later:

/usr/include/ctype.h:197:# define isspace(c) __isctype((c), _ISspace)

I think the intended usage of the macro is as a predicate (is space), for checking whether a character is whitespace (for example the space character, ASCII code 32).
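To make the arithmetic of the quoted `_ISbit` macro concrete, here is a small Python transcription of it. The function name `is_bit` and the `big_endian` parameter are made up for illustration; only the shift logic is taken from the header quoted above. As far as I can tell the result is a flag mask tested against a per-character classification table rather than a character code, so `_ISspace` reads as "is space" either way.

```python
def is_bit(bit: int, big_endian: bool) -> int:
    """Hypothetical Python transcription of the _ISbit(bit) macro quoted above."""
    if big_endian:
        return 1 << bit
    # Little-endian branch: the two bytes of the 16-bit mask are swapped.
    return (1 << bit) << 8 if bit < 8 else (1 << bit) >> 8


print(is_bit(5, big_endian=True))   # 32
print(is_bit(5, big_endian=False))  # 8192
```

So the mask behind `_ISspace` equals 32 only in the big-endian branch; in the little-endian branch it is 8192.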

Right, I am also getting the impression that a conversion to _is_space would be more intuitive.

As you mention, the implementation should be updated to be consistent then.

Likely obvious, but to have it noted: since we handle the first separately, the general implementation for capitals followed by lowercase characters should then be updated as well. aaBBc currently becomes aa_b_bc, while aa_bb_c would be consistent.

I think the rule may be that if there is camelCase, i.e. a single capital letter C, followed by lower case letters, then the capital is part of the following word.

However, if there are several capitals one after the other, then the whole span of capitals is its own independent word (perhaps from an acronym, or an abbreviation, or the product of a deranged mind), and the following lower case letters form their own independent word.
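The rule described in the two paragraphs above can be sketched in a few lines. This is an illustration in Python, not V's implementation and not the code in this PR; the function name `camel_to_snake_span_rule` is hypothetical, and the choice not to double an underscore that is already present is an assumption.

```python
def camel_to_snake_span_rule(s: str) -> str:
    """Sketch of the proposed rule: a run of 2+ capitals is its own word."""
    out = []
    i, n = 0, len(s)
    while i < n:
        if s[i].isupper():
            # Collect the whole run of consecutive capitals.
            j = i
            while j < n and s[j].isupper():
                j += 1
            run = s[i:j]
            # Start a new word unless we are at the start or right after '_'.
            if out and out[-1] != '_':
                out.append('_')
            out.append(run.lower())
            # A run of several capitals is a complete word, so the
            # lowercase letters that follow start another word.
            if len(run) > 1 and j < n and s[j].islower():
                out.append('_')
            i = j
        else:
            out.append(s[i])
            i += 1
    return ''.join(out)


print(camel_to_snake_span_rule('_ISspace'))  # _is_space
print(camel_to_snake_span_rule('aaBBc'))     # aa_bb_c
```

Under this rule `camelCase` still becomes `camel_case`, while `BBaa` becomes `bb_aa` instead of `b_baa`.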

After a short search for common conventions regarding camel to snake conversions: a widely used Rust crate would produce _i_sspace too:

https://crates.io/crates/convert_case

use convert_case::{Case, Casing}; // the `Casing` trait provides `from_case` and `to_case`

fn main() {
    let snake_str = "_ISspace".from_case(Case::Camel).to_case(Case::Snake);
    dbg!(snake_str); // `[main.rs:5:5] snake_str = "_i_sspace"`
    let snake_str = "AAbb".from_case(Case::Camel).to_case(Case::Snake);
    dbg!(snake_str); // `[main.rs:7:5] snake_str = "a_abb"`
}
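The outputs quoted in this thread (`_i_sspace`, `b_baa`, `a_abb`, `aa_b_bc`) are all consistent with one simple rule: the last capital of a run belongs to the lowercase letters that follow it. A Python sketch of that rule is below. It is an assumption reverse-engineered from those outputs; it is not the source of `convert_case` nor of V's `camel_to_snake`, and the function name `camel_to_snake_last_capital` is hypothetical.

```python
def camel_to_snake_last_capital(s: str) -> str:
    """Sketch: split before a capital that follows a non-capital, or that
    follows a capital and is itself followed by a lowercase letter."""
    out = []
    for i, ch in enumerate(s):
        if ch.isupper():
            prev = s[i - 1] if i > 0 else ''
            nxt = s[i + 1] if i + 1 < len(s) else ''
            # No separator at the start of the string or right after '_'.
            if prev and prev != '_' and (not prev.isupper() or nxt.islower()):
                out.append('_')
            out.append(ch.lower())
        else:
            out.append(ch)
    return ''.join(out)


print(camel_to_snake_last_capital('_ISspace'))  # _i_sspace
print(camel_to_snake_last_capital('AAbb'))      # a_abb
```

This only needs to look one character back and one ahead, whereas treating a whole span of capitals as one word means tracking the length of the run.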

If it's time to break a convention, maybe first merge the fix to have it in the history for a potential rollback, then make the change?

Looks like Rust took the "easy" way out... counting the capitals is a bit more work.

I'd prefer the method that looks more correct... it can always be changed if it causes problems.

> If it's time to break a convention, maybe first merge the fix to have it in the history for a potential rollback, then make the change?

Yes, that is a good idea. The version on master is definitely broken, since it loses a letter, while the version here does not.

ttytm added 2 commits June 26, 2024 06:54

fix

b10b011

test

2e596c1

spytheman reviewed Jun 26, 2024

spytheman merged commit 6ecfc6f into vlang:master Jun 27, 2024
76 checks passed

ttytm deleted the builtin/fix-snake-to-camel2 branch June 27, 2024 07:02

ttytm mentioned this pull request Jun 28, 2024

builtin: improve snake to camel case conversion #21755

Merged

raw-bin pushed a commit to raw-bin/v that referenced this pull request Jul 2, 2024

builtin: fix assert '_ISspace'.camel_to_snake() == '_i_sspace' (vla…

6019991

…ng#21736)
