-
-
Notifications
You must be signed in to change notification settings - Fork 182
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix PermissionDeniedException in NativeBroker.defragXMLresource #5311
base: develop
Are you sure you want to change the base?
Conversation
When XML resources are modified using an effective group because the setGID bit is set on the resource NativeBroker.defragXMLResource might throw a PermissionDeniedException. With this patch applied, this can no longer happen as the call to DocumentImpl#copyOf is removed in defragXMLresource. The temporary document is just a container to realign nodes on one BTree page and is destroyed within the method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likely needs a more nuanced fix
final DocumentImpl tempDoc = new DocumentImpl(null, doc.getDocId(), doc); | ||
tempDoc.copyOf(this, doc, doc); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think just removing this is the correct or secure solution. Can you write a bit more detail in the description of the PR about exactly what the problem was please, that might help me to suggest a better solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The commit message body has some more detail.
With this patch applied, this can no longer happen as the call to DocumentImpl#copyOf is removed in defragXMLresource.
The temporary document is just a container to realign nodes on one BTree page and is destroyed within the method.
I hope this clarifies the motivation and justification.
I would like to point out that we arrived at this solution together with @wolfgangmm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likely needs a more nuanced fix
Please only block PRs where you have either proof that it creates a problem or at least share the thoughts what the issue might be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please only block PRs where you have either proof that it creates a problem
Obviously!
or at least share the thoughts what the issue might be
I am concerned that the temporary document which looks persistent to me is not correctly initialised, also I am concerned that the security on it is now too relaxed, this could lead to a data leak/exploit.
Can you clearly describe the initial problem that you are trying to solve in the PR description please. For example, something like: There is an active user U
with the security profile s(U)
, there is document D
with the permissions p(D)
, when the action A
is executed by the subject AS
with security profile s(AS)
, it is rejected because s(AS)
is incompatible with p(D)
.
Obviously you need to replace the symbols above with the actual details.
hope this clarifies the motivation and justification.
It just discusses the solution you arrived at, it does not clearly describe the problem you were trying to solve.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was debugging this code with Juri yesterday and as the original author of the defrag method, I confirm that the offending line tempDoc.copyOf(this, doc, doc);
is not only unnecessary here, but can cause database corruption. The tempDoc
is only used as a temporary container for copying nodes during defrag and is thrown away afterwards! Original permissions and ownership on the resource are not affected by the defragXMLResource
operation in any way, because the tempDoc
is never written. What is written is the original DocumentImpl, which still has the proper permissions and ownership.
It seems I originally added the call to tempDoc.copyOf
19 years ago and it was preserved during a series of refactorings later though the implementation of DocumentImpl
had changed to the point that it now may actually cause a corruption when called in this case.
Please be aware that this issue is a ticking time bomb that will lead to data wipe out!!! Together with the even more fatal #5296 (because it's easier to run into) this is absolutely worth a hotfix, so I would give it a thumbs up, not excluding that more improvements could be done at a later point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @wolfgangmm thanks for your reply. It is interesting to hear about these symptoms, and it does sound like something needs to be fixed. However, the actual original issue and its cause remain unclear. I would like further information please so that we may all understand it...
but can cause database corruption
Could you explain how that is possible please? As I understood it from @line-o's description there was a PermissionDeniedException
which has not yet been explained. I am not sure how a PermissionDeniedException
, which is a genuine way of forbidding a subject from doing something they are not allowed to do, could cause a database corruption here?
Original permissions and ownership on the resource are not affected by the defragXMLResource operation in any way, because the tempDoc is never written.
I did not expect that they would be.
What is written is the original DocumentImpl, which still has the proper permissions and ownership.
Okay.
What I (and presumably others) don't yet understand, and what has not yet been explained, is how can the current approach that places the same permissions onto tempDoc
as the original resource that is to be defragmented, cause a PermissionDeniedException
(and then also if I now understand you, a subsequent database corruption)?
Please be aware that this issue is a ticking time bomb that will lead to data wipe out!!!
I agree that it sounds like it needs to be fixed. At the moment I don't have enough information to see if the proposed fix correctly addresses the cause. I trust you and @line-o will provide the requested information about the underlying issue and cause...
Together with the even more fatal #5296 (because it's easier to run into)
Let's consider that separately. As at the moment #5296 is blocked by work that is still needed to address issues introduced in #5276 which itself was rushed and merged in violation of the agreed community process for reviewing and merging PRs.
this is absolutely worth a hotfix,
From what you have said so far, I agree. We just need more information to be able to complete the review of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@line-o added some information about how to reproduce the issue. It requires a certain combination of permissions. The corruption occurs because the PermissionDeniedException aborts the defrag process after some steps have already been applied. Therefore another way to prevent this might be to move the offending code further up to the start of the method. We did not follow this route as dropping the line is the cleaner way to solve the issue.
The PermissionDeniedException is eventually thrown in line 664 of DocumentImpl#copyOf
:
permissions.setOwner(prev.get_1().getOwner());
why this happens I cannot explain entirely and it seems very weird that the owners would be different. It must be related to the setgid. But in itself it has nothing to do with the actual defrag routine and points to another bug in permission handling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that was my feeling from the start, which is why I wanted more information from yourselves. Based on your summation it seems to me that the this PR is a workaround and not a fix for the root issue.
Well, the root issue this PR tries to solve is the database corruption resulting from the defrag operation. The only correct resolution for this is to fix defrag, which we did. I strongly object against calling a fix a workaround.
Fixing the permission issue may help with other operations, but won't change this PR in any way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wolfgangmm I think you have accidentally deleted or overwritten my comments
@wolfgangmm Are you able to restore my comments please? If not, I can try and re-create them...
…Resource Testing has shown that the previous call to construct tempDoc was actually the correct one, as it is using the default file permissions of the pool. That is key for situations in which the subject initiating the defragmentation is not the _owner_ of the document and only belongs to the group with write permission.
Quality Gate failedFailed conditions |
@adamretter Please note that the test was adapted to address your review in #5273 and also note that the test was written based on src/test/java/org/exist/xquery/update/AbstractTestUpdate.java |
I don't agree. Your code fixed one thing you were interested in but also breaks another thing that you are not interested in. That is not how changes should be implemented in collaborative project IMHO. |
@adamretter What do you mean with this statement?
What did the code break and which code are you talking about, the test? |
Maybe let's look at this the other way around. What is the use of the call to |
@adamretter please watch your language, which I perceive as disrespectful in the last comments answering @line-o. We have no direct interest in this bugfix as we're not using any code affected by it. It's all just for the community's benefit. |
@wolfgangmm I will. No disrespect was or is intended. I'm sorry you felt that way. I have constantly written offers to help throughout this thread. I trust you have seen them? |
@adamretter good. @line-o submitted the PR with the best intentions and did a great job reproducing this rather puzzling edge case. In principle I agree the problem concerning permissions would be worth more investigation because it might also affect other copy operations. I was just arguing that it is not the correct solution for the corruption issue we try to address with this PR. The offending line was carried on over the years without checking if it still applies to the particular case of the defrag operation. As things are, it should clearly be dropped. I'll try to provide more details concerning the other, the permission problem as time allows. It should be straightforward to reproduce using the setup instructions @line-o provided. |
I have never questioned that, nor have I cast any doubt to the contrary. I am simply following a scientific method without emotion to: (a) understand the problem, (b) locate the root of the problem, (c) propose a hypothesis for a fix, and (d) test and prove such a hypotheis through experimental evidence.
@wolfgangmm Perhaps I am misunderstanding something... My understanding is:
Do I have that correct? Assuming that I have understood correctly, then I don't understand how this PR fixes the real issue. Removing the call to I see that there is a test added in this PR. If I was to run just that test without the changes in this PR against eXist-db 7.0.0-SNAPSHOT, would it show me the PermissionDeniedException? If so I can then use that to debug the underlying issue, and I will work forward on that basis. If not, then can you tell me what the test demonstrates? |
@adamretter I am wondering how to put two recent quotes from you together: last week in this PR:
By the way, you still haven't answered what I broke, allegedly. In Slack 3 days ago:
So I fixed a thing users in our community are interested in: exist-db's XQuery update extension. You clearly stated that you are not interested in it. So what is your vested interest here, as this change only affects defragmentation which can only ever run after modification? |
Yes I did... it's been in the review comments of your PR for weeks now - as discussed there, your newly added tests are leaking resources.
I don't really see what you are getting at here! My interest is advising eXist-db users as to the best way of using eXist-db. Your fix addresses an issue with defragmentation, it does not fix other issues with XQuery Update. |
OK, so then have a look at the test in this PR as it is changed to address your review.
But it does fix an existing, reproducable and severe issue with XQuery Update. Other issues, you hinted at, are out of scope. |
I'm not seeing that in this PR. |
Well it is the other changed file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
As it addresses a 'timebomb' I'd vote to pull the PR in ASAP.
No it isn't. It does not fully address my review and is still missing several things. |
I would need some more info on what you think is missing. |
The community agrees to have it in... |
Description:
Additional bugfix for NativeBroker.defragXMLresource where in specific setups with modules and collections with set group ID set a PermissionDeniedException is thrown when the modified XML resource is defragmented.
Example setup
-rwxrwx---
crwxrws---
-rwxr-sr-x
Before this patch is applied the defragXMLResource method attempted to change the owner of the tempDoc which it cannot do as the effective subject is guest.
Reference:
refs #5273
refs #5276
Type of tests:
Code review was addressed in UpdateInsertTriggersDefrag.java