-
-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Build: handle symlinks in a centralized way #9800
Conversation
Hard fail the build if there is symlinks in the output.
Simplifies symlinks hanlding to centralize the security logic in one place only (`safe_open`); reducing the complexity of this code, making it easier to maintain and more secure. Besides, it improves the UX for users to understand why their build is failing. It clearly communicates the error using our internal pattern for known exception managed at `Celery.on_failure`. These are the highlights of this work: - Always follow symlinks - Always copy symlink as-is. Only resolve them when uploading to storage. - Always check symlinks are inside DOCROOT and `base_path` if passed - Hard fail the build if there are symlinks in the output that are outside DOCROOT - Centralize all the symlink checks in `safe_open` (used to open configuration files while building and to open files for uploading them) - Clearly communicate the error to the user - Move upload to storage function inside `Celery.execute` handler to allow us raising `UnsupportedSymlinkFileError` and using `on_failure` to clearly communicate the error to the user
# Don't follow symlinks when uploading to storage. | ||
if filepath.is_symlink(): | ||
log.info( | ||
"Skipping symlink upload.", | ||
path_resolved=str(filepath.resolve()), | ||
) | ||
continue | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should follow symlinks when uploading files (github pages doesn't for example), and with our plans to test rclone, rclone doesn't support something like "follow symlinks but only inside this path", and we also need to check that the symlinks aren't outside the version path not just the docroot path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remove this check from here to allow communicating this decision, "do not follow symlinks", to the user clearly. Otherwise, with this check we are just ignoring symlinks without telling this to the user, which lead to unexpected behavior from a user perspective.
# TODO: this `relative_path` can be improved to remove the first part: | ||
# /symlink-security-exploit/artifacts/latest/generic/passwd.txt | ||
# This is shown to the user currently. | ||
relative_path = str(path).replace(str(docroot.resolve()), '') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is the Path.relative_to method that should be used instead of replace.
) | ||
|
||
if path.is_symlink(): | ||
symlink_path = str(path.readlink().resolve()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
resolve already follows symlinks, readlink isn't needed here.
from_dir = Path(from_dir) | ||
to_dir = Path(to_dir) | ||
if from_dir.is_symlink() or to_dir.is_symlink(): | ||
log.info( | ||
"Not copying directory, one of the paths is a symlink.", | ||
from_dir=from_dir, | ||
from_dir_resolved=from_dir.resolve(), | ||
to_dir=to_dir, | ||
to_dir_resolved=to_dir.resolve(), | ||
) | ||
return False | ||
|
||
_assert_path_is_inside_docroot(from_dir) | ||
_assert_path_is_inside_docroot(to_dir) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these checks are needed, since copytree will follow the symlink when doing the copy of the initial directory (there is a test for this as well), what it won't do is follow any symlinks inside that directory. And the check for _assert_path_is_inside_docroot is also useful, it will help us to prevent any directory traversal vulnerabilities being exploited.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copytree will follow the symlink when doing the copy of the initial directory (there is a test for this as well)
What is that initial directory you refer to here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The directory itself that you are going to copy (from_dir
)
resolved_path=path.resolve(), | ||
) | ||
return None | ||
_assert_path_is_inside_docroot(path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rmtree fails hard (OSError) when trying to delete a dir that points to a symlink, but _assert_path_is_inside_docroot
is still useful as mentioned in my other comment.
|
||
time_before_store_build_artifacts = timezone.now() | ||
# Store build artifacts to storage (local or cloud storage) | ||
self.store_build_artifacts( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should probably do this change in another PR, doesn't look related to symlinks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to move this call inside execute
since we cannot raise exceptions from on_success
and be able to manage them inside on_failure
since it's not called when the code is already inside on_success
.
I'm closing this PR. It was useful to show my proposal about symlinks. Besides, the part related to store build artifacts was already implemented and merged from another PR. |
Simplifies symlinks hanlding to centralize the security logic in one place only (
safe_open
); reducing the complexity of this code, making it easier to maintain and more secure.Besides, it improves the UX for users to understand why their build is failing. It clearly communicates the error using our internal pattern for known exception managed at
Celery.on_failure
.These are the highlights 🔆 of this work:
base_path
if passedDOCROOT
safe_open
(used to open configurationfiles while building and to open files for uploading them)
Celery.execute
handler to allow usraising
UnsupportedSymlinkFileError
and usingon_failure
to clearlycommunicate the error to the user
I'm having issues running the tests right now, but I'll come back to this PR to update them soon.