Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Output is (too) biased #16

Open
briansmith opened this issue Oct 1, 2021 · 4 comments
Open

Output is (too) biased #16

briansmith opened this issue Oct 1, 2021 · 4 comments

Comments

@briansmith
Copy link

Please see rust-random/getrandom#228. I'd rather not going to repeat all the details here because I'd like to keep the conversation in one place. However, I also want you to be aware of this concern. Maybe I'm overlooking something.

@nagisa
Copy link
Owner

nagisa commented Oct 1, 2021

Yeah, this is unfortunate. My memory of this issue is that there's no good way to use CPU detection to filter out the CPUs that do the wrong thing (e.g. there were some bulldozers marketed as Ryzen 1000 series), but nowadays, I believe just considering anything before zen2 be broken would cover everything.

For systemd the x != 0 as a solution makes most sense mostly because it cares more about not blocking the boot process than it is about bias in the randomness. Everything else just copied that.

@nagisa
Copy link
Owner

nagisa commented Oct 3, 2021

Fixed in 5ef1921 I believe.

@nagisa nagisa closed this as completed Oct 3, 2021
@briansmith
Copy link
Author

Some comments on the fix:

  • The comment says "On AMD processor families < 0x17." I don't have enough direct information on the matter but I think BoringSSL excludes some 0x17 CPUs, as I noted in RDRAND-based output is (too) biased rust-random/getrandom#228 (comment). So I'm not sure this is conservative enough. See also RDRAND-based output is (too) biased rust-random/getrandom#228 (comment) where it was noted that other AMD CPUs may have issues (apparently, depending on the motherboard firmware).
  • There is some duplicate code that seems unfortunate, e.g. the duplicate blocks for filtering out bad AMD processors. This makes the change harder to read than if the duplicate logic were factored out in some way.
  • Since the fix wasn't done using the GitHub PR workflow, it's difficult to give line-by-line commentary on the changes. (I used to make changes in a similar way and eventually I switched to the GitHub PR workflow for even my own changes, and I think it actually improved my workflow, although it seemed like a waste of time when I started.)

@nagisa
Copy link
Owner

nagisa commented Oct 4, 2021

The comment says "On AMD processor families < 0x17." I don't have enough direct information on the matter but I think BoringSSL excludes some 0x17 CPUs

As my previous comment indicates, I remember some Ryzen CPUs being affected, but those were not actually Zen, but rather an older family (16h). 17h model 0x7?s are Matisse which led me to recall https://www.phoronix.com/scan.php?page=news_item&px=AMD-Releases-Linux-Zen2-Fix and then a much more widely covered “Destiny 2 does not work” problem.

I do understand why BoringSSL would decide to filter out these CPUs given this information, but its pretty disappointing that there is no more conclusive information. e.g. I'm not entirely sure if it wasn't just rdrand always reporting a failure to generate a random number (and the game/systemd trying to infinitely generate one regardless). At the very least a lack of a linux kernel workaround for this problem would indicate to me that it isn't serious at all… Not quite sure what the best way forward here is.


My reading of the other systemd issue (systemd/systemd#18184) is that the issue is not actually there except for a motherboard with a beta release of firmware for it. I'm not sure how feasible it is for us to meaningfully safeguard against users using non-production firmware for security sensitive workloads. After all, anybody can pick open a microcode blob and do something weird to the rdrand related code therein if they really wanted to.


There is some duplicate code that seems unfortunate, e.g. the duplicate blocks for filtering out bad AMD processors. This makes the change harder to read than if the duplicate logic were factored out in some way.

I'll think about a better approach here.


Since the fix wasn't done using the GitHub PR workflow, it's difficult to give line-by-line commentary on the changes. (I used to make changes in a similar way and eventually I switched to the GitHub PR workflow for even my own changes, and I think it actually improved my workflow, although it seemed like a waste of time when I started.)

Point taken.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants