-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
lr_checksum_fd_compare() does not return cached calculated value #233
Labels
Triaged
Someone on the DNF 5 team has read the issue and determined the next steps to take
Comments
malmond77
added a commit
to malmond77/librepo
that referenced
this issue
Mar 11, 2021
If a file is downloaded via librepo (e.g. `dnf install --downloadonly`) then a request to get the checksum via `lr_checksum_fd_compare()` will not work. It'll only return whether the checksum is valid, and not the actual checksum. This is the simple fix. Addresses rpm-software-management#233
malmond77
added a commit
to malmond77/libdnf
that referenced
this issue
Mar 11, 2021
DNF has been carrying around yum's old checksum function. These functions duplicate code in librepo. They are slower because librepo can employ caching of digests. Lastly, these functions in Python do not know about changes in checksum logic like rpm-software-management/librepo#222 The choices here are: 1. Replicate `lr_checksum_cow_fd()` and caching logic in Python 2. Just use librepo from dnf. This is 2. Note there is an open bug in librepo that forces no caching for `checksum_value()` (rpm-software-management/librepo#233). For now we can keep the perf hit - this is no worse than the existing code and aim for more correctness. This change goes hand in hand with a change to `dnf` itself to make use of the new functions and eliminate the old ones.
malmond77
added a commit
to malmond77/libdnf
that referenced
this issue
Mar 11, 2021
DNF has been carrying around yum's old checksum function. These functions duplicate code in librepo. They are slower because librepo can employ caching of digests. Lastly, these functions in Python do not know about changes in checksum logic like rpm-software-management/librepo#222 The choices here are: 1. Replicate `lr_checksum_cow_fd()` and caching logic in Python 2. Just use librepo from dnf. This is 2. Note there is an open bug in librepo that forces no caching for `checksum_value()` (rpm-software-management/librepo#233). For now we can keep the perf hit - this is no worse than the existing code and aim for more correctness. This change goes hand in hand with a change to `dnf` itself to make use of the new functions and eliminate the old ones. On errors, these functions raise libdnf.error.Error which can be easily mapped into MiscError in dnf
malmond77
added a commit
to malmond77/dnf
that referenced
this issue
Mar 11, 2021
libdnf has the canonical implementation of checksum handling. We aim to replace all use of dnf.yum.misc.checksum() with this. In doing so, this fixes installing previously downloaded and transcoded rpms to support rpm-software-management/librepo#222 This also has some minor performance benefits: librepo's checksum handling employs caching of previously downloaded files via extended attributes. This works for ordinary rpms in the dnf cache, but does not work (yet) for rpm paths specified on the command line due to rpm-software-management/librepo#233. That issue is pretty minor, and the fix ends up in libdnf later. The previous implementation maps all runtime errors to MiscError. We do this still by taking the libdnf.error.Error class (defined in SWIG) and map it directly back to the Python exception as before.
malmond77
added a commit
to malmond77/dnf
that referenced
this issue
Mar 11, 2021
libdnf has the canonical implementation of checksum handling. We aim to replace all use of dnf.yum.misc.checksum() with this. In doing so, this fixes installing previously downloaded and transcoded rpms to support rpm-software-management/librepo#222 This also has some minor performance benefits: librepo's checksum handling employs caching of previously downloaded files via extended attributes. This works for ordinary rpms in the dnf cache, but does not work (yet) for rpm paths specified on the command line due to rpm-software-management/librepo#233. That issue is pretty minor, and the fix ends up in libdnf later. The previous implementation maps all runtime errors to MiscError. We do this still by taking the libdnf.error.Error class (defined in SWIG) and map it directly back to the Python exception as before.
malmond77
added a commit
to malmond77/librepo
that referenced
this issue
Mar 12, 2021
If a file is downloaded via librepo (e.g. `dnf install --downloadonly`) then a request to get the checksum via `lr_checksum_fd_compare()` will not work. It'll only return whether the checksum is valid, and not the actual checksum. This is the simple fix. Addresses rpm-software-management#233
malmond77
added a commit
to malmond77/dnf
that referenced
this issue
Mar 12, 2021
libdnf has the canonical implementation of checksum handling. We aim to replace all use of dnf.yum.misc.checksum() with this. In doing so, this fixes installing previously downloaded and transcoded rpms to support rpm-software-management/librepo#222 This also has some minor performance benefits: librepo's checksum handling employs caching of previously downloaded files via extended attributes. This works for ordinary rpms in the dnf cache, but does not work (yet) for rpm paths specified on the command line due to rpm-software-management/librepo#233. That issue is pretty minor, and the fix ends up in libdnf later. The previous implementation maps all runtime errors to MiscError. We do this still by taking the libdnf.error.Error class (defined in SWIG) and map it directly back to the Python exception as before.
malmond77
added a commit
to malmond77/libdnf
that referenced
this issue
Mar 12, 2021
DNF has been carrying around yum's old checksum function. These functions duplicate code in librepo. They are slower because librepo can employ caching of digests. Lastly, these functions in Python do not know about changes in checksum logic like rpm-software-management/librepo#222 The choices here are: 1. Replicate `lr_checksum_cow_fd()` and caching logic in Python 2. Just use librepo from dnf. This is 2. Note there is an open bug in librepo that forces no caching for `checksum_value()` (rpm-software-management/librepo#233). For now we can keep the perf hit - this is no worse than the existing code and aim for more correctness. This change goes hand in hand with a change to `dnf` itself to make use of the new functions and eliminate the old ones. On errors, these functions raise libdnf.error.Error which can be easily mapped into MiscError in dnf
m-blaha
added
the
Triaged
Someone on the DNF 5 team has read the issue and determined the next steps to take
label
Mar 15, 2021
malmond77
added a commit
to malmond77/librepo
that referenced
this issue
Mar 15, 2021
If a file is downloaded via librepo (e.g. `dnf install --downloadonly`) then a request to get the checksum via `lr_checksum_fd_compare()` will not work. It'll only return whether the checksum is valid, and not the actual checksum. This is the simple fix. Addresses rpm-software-management#233
malmond77
added a commit
to malmond77/librepo
that referenced
this issue
Mar 17, 2021
If a file is downloaded via librepo (e.g. `dnf install --downloadonly`) then a request to get the checksum via `lr_checksum_fd_compare()` will not work. It'll only return whether the checksum is valid, and not the actual checksum. This is the simple fix. Addresses rpm-software-management#233
m-blaha
pushed a commit
that referenced
this issue
Mar 17, 2021
If a file is downloaded via librepo (e.g. `dnf install --downloadonly`) then a request to get the checksum via `lr_checksum_fd_compare()` will not work. It'll only return whether the checksum is valid, and not the actual checksum. This is the simple fix. Addresses #233
Fix is merged. I will update rpm-software-management/libdnf#1164 to explicitly depend on v1.13.1 which has the fix so I can reliably pass TRUE for caching in |
malmond77
added a commit
to malmond77/libdnf
that referenced
this issue
Mar 17, 2021
DNF has been carrying around yum's old checksum function. These functions duplicate code in librepo. They are slower because librepo can employ caching of digests. Lastly, these functions in Python do not know about changes in checksum logic like rpm-software-management/librepo#222 The choices here are: 1. Replicate `lr_checksum_cow_fd()` and caching logic in Python 2. Just use librepo from dnf. This is 2. Note there was bug in librepo that forces no caching for `checksum_value()` (rpm-software-management/librepo#233). This is now fixed, so we depend on librepo-1.13.1 to ensure this fix is present. This change goes hand in hand with a change to `dnf` itself to make use of the new functions and eliminate the old ones. This is rpm-software-management/dnf#1743. We bump the version of this library to ensure this dependency is expressed properly. On errors, these functions raise libdnf.error.Error which can be easily mapped into MiscError in dnf
malmond77
added a commit
to malmond77/libdnf
that referenced
this issue
Mar 17, 2021
DNF has been carrying around yum's old checksum function. These functions duplicate code in librepo. They are slower because librepo can employ caching of digests. Lastly, these functions in Python do not know about changes in checksum logic like rpm-software-management/librepo#222 The choices here are: 1. Replicate `lr_checksum_cow_fd()` and caching logic in Python 2. Just use librepo from dnf. This is 2. Note there was bug in librepo that forces no caching for `checksum_value()` (rpm-software-management/librepo#233). This is now fixed, so we depend on librepo-1.13.1 to ensure this fix is present. This change goes hand in hand with a change to `dnf` itself to make use of the new functions and eliminate the old ones. This is rpm-software-management/dnf#1743. We bump the version of this library in the next commit to ensure this dependency is expressed properly. On errors, these functions raise libdnf.error.Error which can be easily mapped into MiscError in dnf
malmond77
added a commit
to malmond77/libdnf
that referenced
this issue
Mar 17, 2021
DNF has been carrying around yum's old checksum function. These functions duplicate code in librepo. They are slower because librepo can employ caching of digests. Lastly, these functions in Python do not know about changes in checksum logic like rpm-software-management/librepo#222 The choices here are: 1. Replicate `lr_checksum_cow_fd()` and caching logic in Python 2. Just use librepo from dnf. This is 2. Note there was bug in librepo that forces no caching for `checksum_value()` (rpm-software-management/librepo#233). This is now fixed, so we depend on librepo-1.13.1 to ensure this fix is present. This change goes hand in hand with a change to `dnf` itself to make use of the new functions and eliminate the old ones. This is rpm-software-management/dnf#1743. We bump the version of this library in the next commit to ensure this dependency is expressed properly. On errors, these functions raise libdnf.error.Error which can be easily mapped into MiscError in dnf
m-blaha
pushed a commit
to rpm-software-management/libdnf
that referenced
this issue
Mar 18, 2021
DNF has been carrying around yum's old checksum function. These functions duplicate code in librepo. They are slower because librepo can employ caching of digests. Lastly, these functions in Python do not know about changes in checksum logic like rpm-software-management/librepo#222 The choices here are: 1. Replicate `lr_checksum_cow_fd()` and caching logic in Python 2. Just use librepo from dnf. This is 2. Note there was bug in librepo that forces no caching for `checksum_value()` (rpm-software-management/librepo#233). This is now fixed, so we depend on librepo-1.13.1 to ensure this fix is present. This change goes hand in hand with a change to `dnf` itself to make use of the new functions and eliminate the old ones. This is rpm-software-management/dnf#1743. We bump the version of this library in the next commit to ensure this dependency is expressed properly. On errors, these functions raise libdnf.error.Error which can be easily mapped into MiscError in dnf
m-blaha
pushed a commit
to rpm-software-management/dnf
that referenced
this issue
Mar 19, 2021
libdnf has the canonical implementation of checksum handling. We aim to replace all use of dnf.yum.misc.checksum() with this. In doing so, this fixes installing previously downloaded and transcoded rpms to support rpm-software-management/librepo#222 This also has some minor performance benefits: librepo's checksum handling employs caching of previously downloaded files via extended attributes. This works for ordinary rpms in the dnf cache, but does not work (yet) for rpm paths specified on the command line due to rpm-software-management/librepo#233. That issue is pretty minor, and the fix ends up in libdnf later. The previous implementation maps all runtime errors to MiscError. We do this still by taking the libdnf.error.Error class (defined in SWIG) and map it directly back to the Python exception as before.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Context: I'm working on https://fedoraproject.org/wiki/Changes/RPMCoW
I've found that DNF still uses some old yum code to produce and validate checksums of downloaded rpms (https://github.com/rpm-software-management/dnf/blob/master/dnf/yum/misc.py#L153). In the normal case: librepo downloads and validates rpm files using librepo. In the less common path, the rpm is already there, and dnf uses the python code linked to verify the checksum.
This works today because the implementation is close enough to not matter.
Enter: #222 . This augments checksum handling with specific comprehension of "transcoded" rpms. This works in the normal path, but fails with existing packages in cache because the python code is used, and it doesn't know about the transcoded .recorded checksum.
Solving this is fairly involved. There's changes to librepo, libdnf, and dnf needed to fix this. As listed: librepo is first, and I'd like to fix a latent bug in the API, or get guidance on how to better do it.
My thought is to make dnf use
lr_checksum_fd_compare()
andlr_checksum_fd_cmp()
instead ofyum.misc.checksum()
. Why both? Because in one case, we need to validate checksums (https://github.com/rpm-software-management/dnf/blob/master/dnf/package.py#L333) and another we need to just calculate it (https://github.com/rpm-software-management/dnf/blob/master/dnf/package.py#L59)The API for librepo doesn't explicitly support calculating checksums. The closest we get is is to call
lr_checksum_fd_compare()
and pass an empty string forexpected
, ignorematches
, and pass achar *
forcalculated
.If you call
lr_checksum_fd_compare()
withcaching = TRUE
, and an xattr is present, thenmatches
is set, butcalculated
is not set. This feels like a bug: if the caching mechanic is robust enough to use for the normal case, then it should be possible to set*calculated
too. Using caching shouldn't come with this drawback.For now, I'm going to call
lr_checksum_fd_compare()
in mylibdnf
SWIG wrapper withcaching = FALSE
to force a full checksum though librepo. I will reference this issue in my code, and one day, when the fix to this issue is available in downstream packages, I can enable caching for this less common path.The text was updated successfully, but these errors were encountered: