-
-
Notifications
You must be signed in to change notification settings - Fork 2.8k
package manager: filter unpack errors on paths excluded by manifest #19500
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Thanks for doing this other approach. I think it's a good sign that @ianprime0509's feedback from #19324 no longer applies.
Nice work creating these test cases. The dependencies are indeed a problem for upstreaming them. I think it would be best to maintain this test suite separately. I have done something similar with https://github.com/ziglang/contrib-testing - perhaps that could be a place for these extra tests to live? |
9a8f006
to
c22ff8f
Compare
I removed that complicated test from Fetch.zig, moving it to the zig-fetch-test. |
c22ff8f
to
262f995
Compare
Rebasing to the latest changes in package manager. unpackResource now returns UnpackResult structure ( |
Based on file content. Detects elf magic header or shebang line. Executable bit is ignored in hash calculation, as it was before this. So packages hashes are not changed. Reference: ziglang#17463 (comment) Fixes: 17463 Test is here: https://github.com/ianic/zig-fetch-test/blob/7c4600d7bb263f9b72fe3d0b70071f42be89e25c/src/main.zig#L307 (if ziglang#19500 got accepted I'll move this test to the Fetch.zig)
Based on file content. Detects elf magic header or shebang line. Executable bit is ignored in hash calculation, as it was before this. So packages hashes are not changed. Reference: #17463 (comment) Fixes: 17463 Test is here: https://github.com/ianic/zig-fetch-test/blob/7c4600d7bb263f9b72fe3d0b70071f42be89e25c/src/main.zig#L307 (if #19500 got accepted I'll move this test to the Fetch.zig)
262f995
to
b7d1fea
Compare
To summarize changes in this PR:
Type of root_dir in Diagnostic changed from nullable to non-nullable with default empty string value.
Add unable_to_create_file to the diagnostic, like it is in tar diagnostic.
unpackTarball and unpackGitPack returns now UnpackResult. They fill UnpackResult with diagnostic errors instead of raising them. Those errors are then filtered for excluded paths in runResource when we have manifest. If there are remaining errors after filtering they are then raised with UnpackResult.bundleErrors. The rest of changes in Fetch.zig are added UnpackResult structure and tests. Test case files are in Fetch/testdata folder. Test embeds one test file, creates temporary directory, saves embedded file to the temporary directory, initializes Fetch with TestFetchBuilder, runs Fetch on the saved file (cache directory is also positioned in the temporary directory), analyzes cache directory output or fetch errors.
Expose Fetch test so we can test it with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do my suggestions make sense?
src/Package/Fetch.zig
Outdated
const pkg_dir = try unpackResource(f, resource, uri_path, tmp_directory); | ||
// Fetch and unpack a resource into a temporary directory. | ||
var unpack_result = try unpackResource(f, resource, uri_path, tmp_directory); | ||
defer unpack_result.deinit(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use the arena allocator for this data so that this deinit() is not needed. The general-purpose allocator causes contention between threads, whereas each thread has its own arena allocator. It also simplifies memory management to use the arena allocator.
src/Package/Fetch.zig
Outdated
try unpack_result.filterErrors(filter); | ||
if (unpack_result.hasErrors()) { | ||
try unpack_result.bundleErrors(eb, try f.srcLoc(f.location_tok)); | ||
return error.FetchFailed; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
try unpack_result.filterErrors(filter); | |
if (unpack_result.hasErrors()) { | |
try unpack_result.bundleErrors(eb, try f.srcLoc(f.location_tok)); | |
return error.FetchFailed; | |
} | |
try unpack_result.validate(eb, f.location_tok, filter); |
Request to combine all that logic into a validate
function.
src/Package/Fetch.zig
Outdated
allocator: std.mem.Allocator, | ||
errors: std.ArrayListUnmanaged(Error) = .{}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
allocator: std.mem.Allocator, | |
errors: std.ArrayListUnmanaged(Error) = .{}, | |
errors: []Error, |
src/Package/Fetch.zig
Outdated
fn free(self: Error, allocator: std.mem.Allocator) void { | ||
switch (self) { | ||
.unable_to_create_sym_link => |info| { | ||
allocator.free(info.file_name); | ||
allocator.free(info.link_name); | ||
}, | ||
.unable_to_create_file => |info| { | ||
allocator.free(info.file_name); | ||
}, | ||
.unsupported_file_type => |info| { | ||
allocator.free(info.file_name); | ||
}, | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fn free(self: Error, allocator: std.mem.Allocator) void { | |
switch (self) { | |
.unable_to_create_sym_link => |info| { | |
allocator.free(info.file_name); | |
allocator.free(info.link_name); | |
}, | |
.unable_to_create_file => |info| { | |
allocator.free(info.file_name); | |
}, | |
.unsupported_file_type => |info| { | |
allocator.free(info.file_name); | |
}, | |
} | |
} |
src/Package/Fetch.zig
Outdated
fn deinit(self: *UnpackResult) void { | ||
for (self.errors.items) |item| { | ||
item.free(self.allocator); | ||
} | ||
self.errors.deinit(self.allocator); | ||
self.allocator.free(self.root_error_message); | ||
self.allocator.free(self.root_dir); | ||
self.* = undefined; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fn deinit(self: *UnpackResult) void { | |
for (self.errors.items) |item| { | |
item.free(self.allocator); | |
} | |
self.errors.deinit(self.allocator); | |
self.allocator.free(self.root_error_message); | |
self.allocator.free(self.root_dir); | |
self.* = undefined; | |
} |
Test that UnpackResult prints same error output as diagnostic.
Test that we are still outputing same errors.
Report only errors which are not filtered by paths in build.zig.zon.
On case insensitive file systems, don't overwrite files with same name in different casing. Add diagnostic error so caller could decide what to do.
Using test cases from: https://github.com/ianprime0509/pathological-packages repository. Depends on existence of the FAT32 file system. Folder is in FAT32 file system because it is case insensitive and and does not support symlinks. It is complicated test case requires internet connection, depends on existence of FAT32 in the specific location. But it is so valuable for development. Running `zig test Package.zig` is so much faster than building zig binary and running `zig fetch URL`. Committing it here although it should probably be removed.
Should include folder structure, at least root folder so it can be found in pipeToFileSystem.
To be consistent with paths in manifest.
Make it consistent with Cache.Path sub_path. Remove null check in many locations.
Prepare test cases, store them in Fetch/testdata. They cover changes in this PR as well from previous one ziglang#19111.
Use stripRoot in less places. Strip it while copying error from diagnostic to unpack result so other palaces can be free of this logic.
Reference: ziglang#19500 (comment) Arena is now used for Diagnostic (tar and git). `deinit` is not called on Diagnostic allowing us to reference strings from Diagnostic in UnpackResult without dupe. That seamed reasonable to me. Instead of using gpa for Diagnostic, and then dupe to arena. Or using arena for both and making dupe so we can deinit Diagnostic.
b7d1fea
to
4151e6c
Compare
I switch to arena both UnpackResult and Diagnostic (tar and git). That seemed reasonable to me. Instead of using gpa for Diagnostic, and then dupe to arena. Or using arena for both and making dupe so we can deinit Diagnostic. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work. Thanks for all the follow-ups.
fn validate(self: *UnpackResult, f: *Fetch, filter: Filter) !void { | ||
self.filterErrors(filter); | ||
if (self.hasErrors()) { | ||
const eb = &f.error_bundle; | ||
try self.bundleErrors(eb, try f.srcLoc(f.location_tok)); | ||
return error.FetchFailed; | ||
} | ||
} | ||
|
||
// Filter errors by manifest inclusion rules. | ||
fn filterErrors(self: *UnpackResult, filter: Filter) void { | ||
var i = self.errors_count; | ||
while (i > 0) { | ||
i -= 1; | ||
if (self.errors[i].excluded(filter)) { | ||
self.errors_count -= 1; | ||
const tmp = self.errors[i]; | ||
self.errors[i] = self.errors[self.errors_count]; | ||
self.errors[self.errors_count] = tmp; | ||
} | ||
} | ||
} | ||
|
||
// Emmit errors to an `ErrorBundle`. | ||
fn bundleErrors( | ||
self: *UnpackResult, | ||
eb: *ErrorBundle.Wip, | ||
src_loc: ErrorBundle.SourceLocationIndex, | ||
) !void { | ||
if (self.errors_count == 0 and self.root_error_message.len == 0) | ||
return; | ||
|
||
const notes_len: u32 = @intCast(self.errors_count); | ||
try eb.addRootErrorMessage(.{ | ||
.msg = try eb.addString(self.root_error_message), | ||
.src_loc = src_loc, | ||
.notes_len = notes_len, | ||
}); | ||
const notes_start = try eb.reserveNotes(notes_len); | ||
for (self.errors, notes_start..) |item, note_i| { | ||
switch (item) { | ||
.unable_to_create_sym_link => |info| { | ||
eb.extra.items[note_i] = @intFromEnum(try eb.addErrorMessage(.{ | ||
.msg = try eb.printString("unable to create symlink from '{s}' to '{s}': {s}", .{ | ||
info.file_name, info.link_name, @errorName(info.code), | ||
}), | ||
})); | ||
}, | ||
.unable_to_create_file => |info| { | ||
eb.extra.items[note_i] = @intFromEnum(try eb.addErrorMessage(.{ | ||
.msg = try eb.printString("unable to create file '{s}': {s}", .{ | ||
info.file_name, @errorName(info.code), | ||
}), | ||
})); | ||
}, | ||
.unsupported_file_type => |info| { | ||
eb.extra.items[note_i] = @intFromEnum(try eb.addErrorMessage(.{ | ||
.msg = try eb.printString("file '{s}' has unsupported type '{c}'", .{ | ||
info.file_name, info.file_type, | ||
}), | ||
})); | ||
}, | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking that all of this would then be combined into one function now, eliminating the need to mutate the errors array in filterErrors. However, I think this is good enough to merge, so I'll leave you with this idea to implement if you like it.
Follow up of: ziglang#19500 [discussion](ziglang#19500 (comment))
Follow up of: #19500 [discussion](#19500 (comment))
What is the suggested workflow for updating build.zig.zon in these tarballs? Particularly the ones that have duplicate files in them |
Sorry, but not sure that I understand question. I tried building zig-fetch-test with zig from newhash branch. Package with same file name with different case (file4, File4):
But, what I found that deeper dependencies can't be fetched any more. Root build.zig.zon has "mach" as dependency, and it is fetched fine. But mach has "mach-freetype" package dependency. That can't be fetched:
First level dependencies are fetched fine, but new zig requires changes in build.zig.zon for second level dependencies! |
Hmm, I don't know how that can happen, since .allow_missing_paths_field = true,
.allow_missing_nonce = true,
.allow_name_string = true, Edit: did you perhaps try with yesterday's version rather than today's version, which did not take into account @castholm's feedback yet? |
I'm using newhas branch, 5150b22 is last commit.
|
Parsing/validating the |
Mind the difference here. Requiring changes in my build.zig.zon when updating to latest compiler is different then requiring synchronous change in the whole tree of my dependencies. |
Fixes #18089, checks the remaining 2 items, and also fixes #17460.
The first item was to create diagnostic errors in git unpack, instead of overwriting files with the same name on case insensitive file system. Implemented here.
The second one was filtering tar/git unpack errors if they are on path not included by manifest (build.zig.zon). For that, I created UnpackResult which collects errors from tar\git diagnostic, allows filtering them by included paths and emits remaining errors to an
ErrorBundle
.Added tar tests where I create a tarball containing two files with the same name, simulating case insensitive file system, and then call Fetch.run with and without manifest which excludes those files. Can be run with
zig test src/Package.zig
.I used Ian's collection of pathological packages and his nice trick to simulate case insensitive, no symlink file system by mounting FAT32 file. I made a test around that. It depends on access to github repository, and local FAT32 mount, but it covers a lot of cases both for tar and git packages. Should I remove it because of all these dependencies? (it will be skipped if FAT32 file system is not found).
I also tested on Windows and macOS with different packages and the results are the same as described in previous try.