Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix the Node/Item XDM Relationship #3409

Merged
merged 7 commits into from
Oct 10, 2024

Conversation

adamretter
Copy link
Contributor

@adamretter adamretter commented May 12, 2020

Small enhancement built on #3363

But... it changes the on-disk format of the Native Value Index (values.dbx), and so requires a major release.

@adamretter adamretter added the enhancement new features, suggestions, etc. label May 12, 2020
@adamretter adamretter added this to the eXist-6.0.0 milestone May 12, 2020
@adamretter adamretter requested a review from a team May 12, 2020 19:07
line-o
line-o previously requested changes May 12, 2020
Copy link
Member

@line-o line-o left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also revert the integer value of NO_SUCH_VALUE back to -1

@adamretter
Copy link
Contributor Author

@line-o I don't think that needs to be done.

@line-o
Copy link
Member

line-o commented May 12, 2020

It makes much more sense to have it as -1, when it is the only negative value. Comparable to ['a' , 'b'].indexOf('c')

@adamretter
Copy link
Contributor Author

@line-o That's one perspective... feel free to send a follow-up PR.

@dizzzz
Copy link
Member

dizzzz commented Jun 15, 2020

What is the status of the PR?

@joewiz
Copy link
Member

joewiz commented Jan 12, 2021

@dizzzz When you have the chance could you restart the AppVeyor CI on this PR? Thanks!

@dizzzz
Copy link
Member

dizzzz commented Jan 13, 2021

sorry...
image

probably if we rebase, a new build is triggered

@adamretter adamretter force-pushed the hotfix/number-comparisons-2 branch from 4c2eb67 to c4542b0 Compare March 20, 2021 21:05
@sonarqubecloud
Copy link

Kudos, SonarCloud Quality Gate passed!

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

0.0% 0.0% Coverage
0.0% 0.0% Duplication

@adamretter adamretter modified the milestones: eXist-6.0.0, eXist-7.0.0 Feb 14, 2022
@adamretter adamretter force-pushed the hotfix/number-comparisons-2 branch from c4542b0 to 1201eeb Compare February 5, 2023 21:31
@line-o
Copy link
Member

line-o commented Feb 13, 2023

NO_SUCH_VALUE was changed from -1 to -99 in 3102f46
I would like the reordering of Value types to include this to be -1 again.

@adamretter adamretter force-pushed the hotfix/number-comparisons-2 branch from 1201eeb to 1af7f65 Compare February 14, 2023 11:03
private final static Int2ObjectOpenHashMap<String[]> typeNames = new Int2ObjectOpenHashMap<>(64, Hash.FAST_LOAD_FACTOR);
private final static Object2IntOpenHashMap<String> typeCodes = new Object2IntOpenHashMap<>(100, Hash.FAST_LOAD_FACTOR);
// Tombstone value used for FastUtil Int maps
private static final int NO_SUCH_VALUE = -99;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adamretter why is this value -99 in this case? I this a related to the FastUtil API?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the tombstone value be -1 as well?

Copy link
Member

@line-o line-o Feb 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just tested this. There is no technical reason to leave it at -99 the tests all pass with -1 as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, there is no technical reason why it is -1 or -99, technically it makes no difference. To be clear, I have never claimed otherwise. The reason it was changed was for human reasons:

  1. When looking at it as an integer value it is much easier to spot.
  2. When looking at it as an unsigned binary value the wrap causes it to look similar to other valid values, whereas -99 was an easier pattern to spot.

Both of these were factors as to why I changed it whilst developing and debugging this code. As it technically makes no difference, I suggest leaving it as -99 to assist the humans that will likely have to work on this code in future.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not convinced -1 is very different from the other values as it is the only negative value.

Copy link
Contributor Author

@adamretter adamretter Feb 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the person who actually did the work on this and spent time debugging and fixing it when there were issues, I would hope that you would believe me when I am trying to tell you that I had problems spotting the -1 amongst other values, and that switching it to -99 made it easier to find the issues! As you did not do this work yourself, and did not have that experience personally, I find your comments perplexing!

As we agree that there is no technical advantage to changing the value, but that I have explained that there is a very real human advantage to leaving it as -99, please can you explain what your argument is for changing it to -1, i.e. What advantage do you see in changing this that causes you to be so insistent about it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets make it Integer.MIN_VALUE then

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For what reason exactly?

@line-o
Copy link
Member

line-o commented Feb 14, 2023

I would propose to shift the current mapping of node types by one which would make the relationship quite clear as all
types that are subtypes of NODE would have a higher integer value;

    public static final int NODE = 1;
    public final static int ELEMENT = 2;
    public final static int ATTRIBUTE = 3;
    public final static int TEXT = 4;
    public final static int PROCESSING_INSTRUCTION = 5;
    public final static int COMMENT = 6;
    public final static int DOCUMENT = 7;

Copy link
Member

@line-o line-o left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look good, from a first look. Ignore earlier comment, as it did not show the new commits at the time of writing.

@adamretter
Copy link
Contributor Author

adamretter commented Feb 21, 2023

I would propose to shift the current mapping of node types by one which would make the relationship quite clear as all
types that are subtypes of NODE would have a higher integer value;

@line-o I am not sure if I understand what you are suggesting? At the moment in this PR I have:

    public final static int NODE = 56;
    public final static int ATTRIBUTE = 57;
    public final static int COMMENT = 58;
    public final static int DOCUMENT = 59;
    public final static int ELEMENT = 60;
    public final static int NAMESPACE = 61;
    public final static int PROCESSING_INSTRUCTION = 62;
    public final static int TEXT = 63;

It looks to me from the above that "all types that are subtypes of NODE" already "have a higher integer value" than NODE. Can you explain further what you were suggesting please?

@line-o
Copy link
Member

line-o commented Feb 21, 2023

The changes look good, from a first look. Ignore earlier comment, as it did not show the new commits at the time of writing.

@adamretter

Still wondering why -99 is so important. I will introduce a PR changing that if it is not done here.

@line-o line-o self-requested a review February 21, 2023 11:49
Copy link
Member

@line-o line-o left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think?

private final static Int2ObjectOpenHashMap<String[]> typeNames = new Int2ObjectOpenHashMap<>(64, Hash.FAST_LOAD_FACTOR);
private final static Object2IntOpenHashMap<String> typeCodes = new Object2IntOpenHashMap<>(100, Hash.FAST_LOAD_FACTOR);
// Tombstone value used for FastUtil Int maps
private static final int NO_SUCH_VALUE = -99;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets make it Integer.MIN_VALUE then

@reinhapa
Copy link
Member

@adamretter seem to have some license header problem, that needs to be fixed first

@reinhapa
Copy link
Member

@adamretter there seem to be some license header problems left

@line-o line-o self-requested a review February 28, 2023 15:26
@line-o
Copy link
Member

line-o commented Feb 28, 2023

I checked the unsigned binary values of

  • -1 : 1111111111111111
  • -99 : 1111111110011101
  • -2147483648 : 10000000000000000000000000000000

@dizzzz
Copy link
Member

dizzzz commented Nov 20, 2023

Needed

  • rebase
  • reformat
  • check tests

@dizzzz
Copy link
Member

dizzzz commented Apr 24, 2024

I'd like to see the PR pulled in as-is, the discussion is interesting (..) but not so important I'd say. Waste of time/energy....

@adamretter adamretter force-pushed the hotfix/number-comparisons-2 branch from baaf242 to affb1fd Compare September 2, 2024 10:02
@adamretter
Copy link
Contributor Author

@dizzzz I have rebased it as you requested. It think it is now ready for you to merge.

@adamretter adamretter force-pushed the hotfix/number-comparisons-2 branch 2 times, most recently from 3fde20d to d82ddf8 Compare September 2, 2024 10:05
Copy link
Member

@reinhapa reinhapa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. The tests do not pass at all
  2. Why where the shorter switch expressions being replaced with the old style?

Comment on lines +401 to +419
switch (requiredType) {
case Type.ANY_ATOMIC_TYPE:
case Type.ITEM:
case Type.STRING:
return StringValue.EMPTY_STRING;
case Type.NORMALIZED_STRING:
case Type.TOKEN:
case Type.LANGUAGE:
case Type.NMTOKEN:
case Type.NAME:
case Type.NCNAME:
case Type.ID:
case Type.IDREF:
case Type.ENTITY:
return new StringValue(getExpression(), "", requiredType);
case Type.ANY_URI:
return AnyURIValue.EMPTY_URI;
case Type.BOOLEAN:
return BooleanValue.FALSE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restore previous shorter switch version

Comment on lines +469 to +471
default:
throw new XPathException(getExpression(), "cannot convert empty value to " + requiredType);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use shorter switch style if possible

Comment on lines +72 to +94
public AtomicValue convertTo(final int requiredType) throws XPathException {
switch (requiredType) {
case Type.BOOLEAN:
case Type.ANY_ATOMIC_TYPE:
case Type.ITEM:
return this;
case Type.NUMERIC:
case Type.INTEGER:
return new IntegerValue(getExpression(), value ? 1 : 0);
case Type.DECIMAL:
return new DecimalValue(getExpression(), value ? 1 : 0);
case Type.FLOAT:
return new FloatValue(getExpression(), value ? 1 : 0);
case Type.DOUBLE:
return new DoubleValue(getExpression(), value ? 1 : 0);
case Type.STRING:
return new StringValue(getExpression(), getStringValue());
case Type.UNTYPED_ATOMIC:
return new UntypedAtomicValue(getExpression(), getStringValue());
default:
throw new XPathException(getExpression(), ErrorCodes.XPTY0004,
"cannot convert 'xs:boolean(" + value + ")' to " + Type.getTypeName(requiredType));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restore previous shorter switch version

Comment on lines +148 to +178
public AtomicValue convertTo(final int requiredType) throws XPathException {
switch (requiredType) {
case Type.DATE_TIME:
case Type.ANY_ATOMIC_TYPE:
case Type.ITEM:
return this;
case Type.DATE_TIME_STAMP:
return new DateTimeStampValue(getExpression(), calendar);
case Type.DATE:
return new DateValue(getExpression(), calendar);
case Type.TIME:
return new TimeValue(getExpression(), calendar);
case Type.G_YEAR:
return new GYearValue(getExpression(), calendar);
case Type.G_YEAR_MONTH:
return new GYearMonthValue(getExpression(), calendar);
case Type.G_MONTH_DAY:
return new GMonthDayValue(getExpression(), calendar);
case Type.G_DAY:
return new GDayValue(getExpression(), calendar);
case Type.G_MONTH:
return new GMonthValue(getExpression(), calendar);
case Type.STRING:
return new StringValue(getExpression(), getStringValue());
case Type.UNTYPED_ATOMIC:
return new UntypedAtomicValue(getExpression(), getStringValue());
default:
throw new XPathException(getExpression(), ErrorCodes.FORG0001,
"Type error: cannot cast xs:dateTime to "
+ Type.getTypeName(requiredType));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restore previous shorter switch version

Comment on lines +98 to +128
public AtomicValue convertTo(final int requiredType) throws XPathException {
switch (requiredType) {
case Type.DATE:
case Type.ANY_ATOMIC_TYPE:
case Type.ITEM:
return this;
case Type.DATE_TIME:
return new DateTimeValue(getExpression(), calendar);
case Type.G_YEAR:
return new GYearValue(getExpression(), this.calendar);
case Type.G_YEAR_MONTH:
return new GYearMonthValue(getExpression(), calendar);
case Type.G_MONTH_DAY:
return new GMonthDayValue(getExpression(), calendar);
case Type.G_DAY:
return new GDayValue(getExpression(), calendar);
case Type.G_MONTH:
return new GMonthValue(getExpression(), calendar);
case Type.UNTYPED_ATOMIC: {
final DateValue dv = new DateValue(getExpression(), getStringValue());
yield new UntypedAtomicValue(getExpression(), dv.getStringValue());
return new UntypedAtomicValue(getExpression(), dv.getStringValue());
}
case Type.STRING -> {
case Type.STRING: {
final DateValue dv = new DateValue(getExpression(), calendar);
yield new StringValue(getExpression(), dv.getStringValue());
return new StringValue(getExpression(), dv.getStringValue());
}
default -> throw new XPathException(getExpression(), ErrorCodes.FORG0001, "can not convert " +
Type.getTypeName(getType()) + "('" + getStringValue() + "') to " +
Type.getTypeName(requiredType));
};
default:
throw new XPathException(getExpression(), ErrorCodes.FORG0001, "can not convert " +
Type.getTypeName(getType()) + "('" + getStringValue() + "') to " +
Type.getTypeName(requiredType));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restore previous shorter switch version

Comment on lines +80 to +92
switch (requiredType) {
case Type.G_YEAR_MONTH:
case Type.ANY_ATOMIC_TYPE:
case Type.ITEM:
return new GYearMonthValue(getExpression(), this.calendar);
case Type.STRING:
return new StringValue(getExpression(), getStringValue());
case Type.UNTYPED_ATOMIC:
return new UntypedAtomicValue(getExpression(), getStringValue());
default:
throw new XPathException(getExpression(), ErrorCodes.FORG0001,
"Type error: cannot cast xs:time to "
+ Type.getTypeName(requiredType));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restore previous shorter switch version

Comment on lines +79 to +92
switch (requiredType) {
case Type.G_YEAR:
case Type.ANY_ATOMIC_TYPE:
case Type.ITEM:
return this;
case Type.STRING:
return new StringValue(getExpression(), getStringValue());
case Type.UNTYPED_ATOMIC:
return new UntypedAtomicValue(getExpression(), getStringValue());
default:
throw new XPathException(getExpression(), ErrorCodes.FORG0001,
"Type error: cannot cast xs:time to "
+ Type.getTypeName(requiredType));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restore previous shorter switch version

Comment on lines +129 to +141
switch (requiredType) {
case Type.ANY_ATOMIC_TYPE:
case Type.ITEM:
case Type.QNAME:
return this;
case Type.STRING:
return new StringValue(getExpression(), getStringValue());
case Type.UNTYPED_ATOMIC:
return new UntypedAtomicValue(getExpression(), getStringValue());
default:
throw new XPathException(getExpression(), ErrorCodes.FORG0001,
"A QName cannot be converted to " + Type.getTypeName(requiredType));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restore previous shorter switch version

Comment on lines +99 to +114
switch (requiredType) {
case Type.TIME:
case Type.ANY_ATOMIC_TYPE:
case Type.ITEM:
return this;
// case Type.DATE_TIME :
// xs:time -> xs:dateTime conversion not defined in Funcs&Ops 17.1.5
case Type.STRING -> new StringValue(getExpression(), getStringValue());
case Type.UNTYPED_ATOMIC -> new UntypedAtomicValue(getExpression(), getStringValue());
default -> throw new XPathException(getExpression(), ErrorCodes.FORG0001,
"Type error: cannot cast xs:time to "
+ Type.getTypeName(requiredType));
};
case Type.STRING:
return new StringValue(getExpression(), getStringValue());
case Type.UNTYPED_ATOMIC:
return new UntypedAtomicValue(getExpression(), getStringValue());
default:
throw new XPathException(getExpression(), ErrorCodes.FORG0001,
"Type error: cannot cast xs:time to "
+ Type.getTypeName(requiredType));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restore previous shorter switch version

Comment on lines +114 to +141
switch (requiredType) {
case Type.ITEM:
case Type.ANY_ATOMIC_TYPE:
case Type.YEAR_MONTH_DURATION:
return this;
case Type.STRING:
return new StringValue(getExpression(), getStringValue());
case Type.DURATION:
return new DurationValue(getExpression(), TimeUtils.getInstance().newDuration(
duration.getSign() >= 0,
(BigInteger) duration.getField(DatatypeConstants.YEARS),
(BigInteger) duration.getField(DatatypeConstants.MONTHS),
null, null, null, null
));
case Type.DAY_TIME_DURATION:
return new DayTimeDurationValue(getExpression(), DayTimeDurationValue.CANONICAL_ZERO_DURATION);
//case Type.DOUBLE:
//return new DoubleValue(monthsValueSigned().doubleValue());
//return new DoubleValue(Double.NaN);
//case Type.DECIMAL:
//return new DecimalValue(monthsValueSigned().doubleValue());
case Type.UNTYPED_ATOMIC -> new UntypedAtomicValue(getExpression(), getStringValue());
default -> throw new XPathException(getExpression(), ErrorCodes.XPTY0004,
"cannot cast 'xs:yearMonthDuration(\"" + getStringValue() +
"\")' to " + Type.getTypeName(requiredType));
};
case Type.UNTYPED_ATOMIC:
return new UntypedAtomicValue(getExpression(), getStringValue());
default:
throw new XPathException(getExpression(), ErrorCodes.XPTY0004,
"cannot cast 'xs:yearMonthDuration(\"" + getStringValue() +
"\")' to " + Type.getTypeName(requiredType));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restore previous shorter switch version

@adamretter adamretter force-pushed the hotfix/number-comparisons-2 branch 2 times, most recently from 2ef6a14 to 756462c Compare October 9, 2024 08:02
@reinhapa
Copy link
Member

reinhapa commented Oct 9, 2024

@adamretter there still testfailures

@adamretter adamretter force-pushed the hotfix/number-comparisons-2 branch from 7384f2d to ac6bc65 Compare October 9, 2024 10:06
@adamretter
Copy link
Contributor Author

there still testfailures
@reinhapa Yup - I am still tweaking it ;-)

@adamretter adamretter force-pushed the hotfix/number-comparisons-2 branch 2 times, most recently from f0733bc to 439307e Compare October 9, 2024 17:07
@adamretter
Copy link
Contributor Author

adamretter commented Oct 9, 2024

@reinhapa to answer your questions:

The tests do not pass at all

They should do now. I have split some things not related to Node/Item XDM Relationship out of this PR and will send them in a follow-up PR.

Why where the shorter switch expressions being replaced with the old style?

This is intentional, let me try and explain...

The "shorter switch expressions" (as you described them) are not readable when there are multiple values for the same case. The reason for this is they extend horizontally on a single line across the screen and make it impossible to understand easily or see the changes between two versions of the code. The "old style" (as you described them) instead is laid out on separate lines vertically and is much easier to read when you are considering what has changed between two or more versions.

Also when considering changes, consider that most diff tools are line based, and so putting it all onto one line stops a diff tool from clearly showing and/or highlighting small changes. For example here is a screenshot from a 2 version diff:

Screenshot 2024-10-09 at 19 13 50

Now consider when you have a 3 version diff as is common when working with Git versions and tools like diffmerge. It becomes almost impossible to understand the changes when you have one long line in 3 different versions that are compared side by side. For example here is a screenshot from a 3 version diffmerge:

Screenshot 2024-10-09 at 23 01 18

The "shorter switch expressions" are not bad per-se, but they are much much worse (i.e. less readable and understandable) by tools (like git, diff, or diffmerge, or humans) which are line based when you are comparing multiple versions. The reason for that is that when there are multiple case values they are laid out horizontally all on one line and that line often extends far off the viewable area on the screen.

Does that make sense?

Copy link
Member

@dizzzz dizzzz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (though I had one or 2 questions)

@dizzzz dizzzz merged commit 4fcb864 into eXist-db:develop Oct 10, 2024
12 of 13 checks passed
@dizzzz
Copy link
Member

dizzzz commented Oct 10, 2024

thnx!

@reinhapa
Copy link
Member

@reinhapa to answer your questions:
This is intentional, let me try and explain...

...

The "shorter switch expressions" are not bad per-se, but they are much much worse (i.e. less readable and understandable) by tools (like git, diff, or diffmerge, or humans) which are line based when you are comparing multiple versions. The reason for that is that when there are multiple case values they are laid out horizontally all on one line and that line often extends far off the viewable area on the screen.

Does that make sense?

I see your point of the problem when comparing. But this could be circumvented by still putting each value on on a seperate line but leave them on the same case label if realy needed but gaining the security of no fall-thru errors. Besides that it would be cool to have those types as Enum anyway in the future as a missing value would be detected already at compile time.

@adamretter
Copy link
Contributor Author

would be cool to have those types as Enum anyway

Yes it would, I already did that in another branch on another project ;-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement new features, suggestions, etc.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants