Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add "Add TPM" VM action #1796

Merged
merged 4 commits into from
Nov 5, 2024
Merged

Conversation

Shotaro-Kawaguchi
Copy link
Contributor

@Shotaro-Kawaguchi Shotaro-Kawaguchi commented Aug 30, 2024

We implemented this feature because there was a use case where upgrading the Windows OS to Windows11 required additional TPM devices.

Selecting "Add TPM" will add a TPM device with default settings.
"Add TPM" is only displayed if no TPM device exists.


Machines: Action to add a TPM

This can help with upgrading old VMs to a newer guest OS that requires a TPM, such as Windows 11.

tpm_kebab

@martinpitt
Copy link
Member

Thanks @Shotaro-Kawaguchi ! I had a very cursory look. I'm mostly interested in the use case -- so far we've treated that as an implementation detail, and osinfo-db was telling libvirt whether the target OS needs a TPM (like Windows) or can use a TPM (EFI BIOS), and automatically adds an appropriate one. In which cases does that not work? Every bit of UI addition is code to maintain, increases combinatorial explosion of testing, and is a mental load to users, so we don't want to do that lightly.

This needs a round of design review and updates -- the dialogs are way too technical, and give the admin no guidance at all. When to use "tis" or "crb", and what's the difference? What's the (dis)advantage of emulator vs. passthrough? Is there any reason to use "1.2" when "2.0" is available? and so on. This needs some (i) popups with documentation links, and if possible some clearer descriptions.

Also this needs integration tests for all code paths. testConfigureBeforeInstall() has some cursory check that a software TPM was added, so our test infrastructure supports it. testConfigureBeforeInstallBiosTPM() also covers it explicitly.

Unfortunately it will take our team very long to help with that as we don't have a current machines maintainer. So it would be good if you could work on tests yourself? (Happy to guide, of course)

@martinpitt martinpitt marked this pull request as draft August 30, 2024 12:53
@martinpitt martinpitt added question Further information is requested needs-design labels Aug 30, 2024
@Shotaro-Kawaguchi
Copy link
Contributor Author

@martinpitt Thank you for your response.

Thanks @Shotaro-Kawaguchi ! I had a very cursory look. I'm mostly interested in the use case -- so far we've treated that as an implementation detail, and osinfo-db was telling libvirt whether the target OS needs a TPM (like Windows) or can use a TPM (EFI BIOS), and automatically adds an appropriate one. In which cases does that not work? Every bit of UI addition is code to maintain, increases combinatorial explosion of testing, and is a mental load to users, so we don't want to do that lightly.

A TPM device is required for Windows 11, but it is not needed for Windows 10 and older. Therefore, I believe that a TPM device will not be automatically added when a virtual machine is upgraded to Windows 11.
I think it is useful to be able to add a TPM device when upgrading from Windows 10 or older to Windows 11.

Also this needs integration tests for all code paths.

I will try to create the test code.

@martinpitt
Copy link
Member

I think it is useful to be able to add a TPM device when upgrading from Windows 10 or older to Windows 11.

OK, I see that. But I still don't believe that we should throw all these technicalities to the user. The dialog allows 8 combinations, which is too much both for the user to decide on (and document) and also for testing. This needs to be reduced, automatically pick the most appropriate one (virt-xml should help with that), and the remaining options better explained. But ideally there should not be any options at all.

Thanks!

@jelly
Copy link
Member

jelly commented Sep 5, 2024

For upgrading it seems sensible to add a TPM, I assume most users would want a swtpm (software tpm). Passing through a TPM sounds like extra hurdles (plus the host might use it?).

For UEFI, it seems virt-install these days already adds a TPM

https://github.com/virt-manager/virt-manager/blob/2da4884962d7a8d4e66e4d6f273a452fdd3f2d4f/virtinst/guest.py#L1081

@garrett
Copy link
Member

garrett commented Sep 9, 2024

Agreed; this should work by default and shouldn't have to expose any configuration UI.

Creating a VM in UEFI mode should ideally automatically create a software TPM compatible with Windows 11. (I'm not sure if it does. It seems like it might, according to @jelle's comment.) And if it's compatible with Windows 11, it should also be compatible with Linux too.

Are there any cases where having a software TPM that's compatible with Windows 11 isn't enough for either Windows or Linux?

@jelly
Copy link
Member

jelly commented Sep 9, 2024

Agreed; this should work by default and shouldn't have to expose any configuration UI.

Creating a VM in UEFI mode should ideally automatically create a software TPM compatible with Windows 11. (I'm not sure if it does. It seems like it might, according to @jelle's comment.) And if it's compatible with Windows 11, it should also be compatible with Linux too.

Are there any cases where having a software TPM that's compatible with Windows 11 isn't enough for either Windows or Linux?

Well as the contributor noted, when upgrading there is no TPM added. Say you have an old Windows 10 machine you want to upgrade. Current workflow would be to detach the disk, and re-create a new virtual machine and specify windows 11 as OS.


I can confirm that changing an existing VM from BIOS => UEFI adds a tpm as expected.

[jelle@t14s][~/projects/arch-mkosi-boxes/images]%virsh dumpxml arch-test | grep tpm
    <tpm model='tpm-crb'>
      <alias name='tpm0'/>
    </tpm>

@garrett
Copy link
Member

garrett commented Sep 9, 2024

Should a UEFI VM without a TPM be considered a bug? Should we automatically add it? Or have an inline alert banner to add it, or something like that?

@Shotaro-Kawaguchi
Copy link
Contributor Author

@martinpitt If the implementation is such that pressing the add button adds the TPM device with all options set to default values, would it be possible to merge it?

@martinpitt
Copy link
Member

@Shotaro-Kawaguchi You are actually in a better position to answer that -- as you proposed and want the feature, you should also know how much configurability you need for your use case. From my point of view, I think that "defaults" would be a great start -- let virt-xml figure out the appropriate defaults, and add an integration test for adding and removal -- then we can land and refine with more options later if/when they become necessary.

Thanks!

@martinpitt
Copy link
Member

@Shotaro-Kawaguchi We just discussed this in our meeting. We feel that this is a too rare use case to put this into the overview all the time. If the VM has a TPM, then we don't need to show anything, and for most existing VMs you don't need to change it either.

A good place for these is the kebab menu, similar to the "Replace SPICE" functionality (which is a similar corner case):

dot menu

This could just grow an "Add TPM" action if the VM doesn't already have one. I don't think that you'd even need to remove it, or do you have such a use case?

Thanks!

@Shotaro-Kawaguchi
Copy link
Contributor Author

@martinpitt Thank you for your feedback.
Since there are no use cases to delete, we will place the "Add TPM" action in the kebab menu if the VM does not yet have a TPM.

@Shotaro-Kawaguchi

This comment was marked as resolved.

Copy link
Member

@martinpitt martinpitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Shotaro-Kawaguchi ! This looks nice now (aside from some code cleanup and the missing "needs shutdown" integration). This now needs tests -- do you want to work on them, or need help?

src/libvirt-xml-parse.js Outdated Show resolved Hide resolved
src/libvirt-xml-parse.js Outdated Show resolved Hide resolved
src/components/vm/vmActions.jsx Outdated Show resolved Hide resolved
src/components/vm/vmActions.jsx Outdated Show resolved Hide resolved
src/components/common/needsShutdown.jsx Show resolved Hide resolved
@Shotaro-Kawaguchi
Copy link
Contributor Author

Shotaro-Kawaguchi commented Oct 23, 2024

Thank you.
I will proceed with modifying the source code.

This now needs tests -- do you want to work on them, or need help?

Since I don't yet have the know-how to create a test program, is it possible to get some help?

@martinpitt
Copy link
Member

@Shotaro-Kawaguchi No worries. @mvollmer or I can spend some time on this next week and complete it with tests. So are you happy with the functionality now?

@Shotaro-Kawaguchi
Copy link
Contributor Author

@martinpitt Thank you for your support.
Yes, I am happy with the current functionality.

@martinpitt
Copy link
Member

This was conflicty, so I pushed a rebase. I want to get a first run of the tests to check for regressions. I'll work on a test today. Thanks!

@martinpitt martinpitt self-assigned this Oct 29, 2024
@martinpitt martinpitt added enhancement New feature or request and removed question Further information is requested needs-design labels Oct 29, 2024
@martinpitt martinpitt changed the title Add "view, edit, and add TPM devices" to overview Add "Add TPM" VM action Oct 29, 2024
@martinpitt
Copy link
Member

Next push: I added integration tests.

The code change is still more complicated than it needs to be: the changes to src/libvirt-xml-parse.js can be reverted, as the new info from it isn't actually being used anywhere.

@martinpitt
Copy link
Member

I think it would be good to get some feedback after adding a TPM.

Yeah, I stumbled over that when writing the tests -- you can only tell by inspecting the XML and the menu entry going away. Maybe a transient alert with "TPM added successfully" or so?

Maybe it is worth alerting the user when a VM does not have a TPM?

That's too noisy IMHO -- like, almost all existing VMs without a TPM don't need one.

@martinpitt
Copy link
Member

I think it would be good to get some feedback after adding a TPM.

Yeah, I stumbled over that when writing the tests -- you can only tell by inspecting the XML and the menu entry going away. Maybe a transient alert with "TPM added successfully" or so?

@garrett any opinion on this?

@@ -121,6 +125,10 @@ export function getDevicesRequiringShutdown(vm) {
if (needsShutdownSpice(vm))
devices.push(_("SPICE"));

// TPM
if (needsShutdownTpm(vm))
devices.push(_("TPM"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right you are, coverage tests. I added a check for adding a TPM to a running VM, and test that "needs restart" indicator. This also uncovered a bug: The "Add TPM" action does not go away in this state, i.e. TPM got added to the inactive XML, but not yet the running one.

I fixed that bug and added tests for the "needs shutdown" and "add TPM while running" cases.

@martinpitt
Copy link
Member

When adding a TPM for a running VM, there is at least some visual feedback:

image

There's none for a non-running VM, other than the menu entry going away. However, given how much of a fringe feature that is, I think we can handle a possible extra alert as a follow-up?

@martinpitt
Copy link
Member

Ugh, thereis an AppArmor problem on Debian, and this doesn't work at all on rhel 8 and arch. Back to yak shaving..

@martinpitt
Copy link
Member

martinpitt commented Nov 4, 2024

Ah, this is virt-xml being dumb:

ERROR unsupported configuration: TPM version '2.0' is not supported

(tested locally on arch, but I suppose it's the same on rhel-8-10). And we don't show error messages from a failed addition anywhere except for the log.

I reworked all that, should be better now.

@martinpitt
Copy link
Member

martinpitt commented Nov 4, 2024

Now we are down to starting the VM failing o fedora 41 and RHEL 10. This is actually a known issue, it just happens in a new test now, so we need to adjust the naughty: cockpit-project/bots#7064

martinpitt added a commit to martinpitt/bots that referenced this pull request Nov 4, 2024
cockpit-project/cockpit-machines#1796 adds TPM
testing to TestMachinesSettings.testMultipleSettings, which also fails
due to cockpit-project#6792. Add a corresponding pattern.
jelly pushed a commit to cockpit-project/bots that referenced this pull request Nov 4, 2024
cockpit-project/cockpit-machines#1796 adds TPM
testing to TestMachinesSettings.testMultipleSettings, which also fails
due to #6792. Add a corresponding pattern.
@martinpitt martinpitt removed the blocked label Nov 4, 2024
martinpitt and others added 4 commits November 4, 2024 16:45
Don't just sweep them under the rug -- this is actually important for
moving to EFI firmware. It'll become even more important with an
explicit "Add TPM" action that we'll add in the next step.

Unskip `testConfigureBeforeInstallBiosTPM` on RHEL 8 and Arch, and explicitly
check how it fails there.
This helps with upgrading old VMs to a newer guest OS that requires a
TPM, such as Windows 11.

Replace the virt-xml hackery in `testConfigureBeforeInstallBiosTPM` with
this corresponding new UI action. Add tests for adding TPM to both a
shutdown and a running VM.

Ignore swtpm AppArmor denial message in the test. We already had this in
`testConfigureBeforeInstallBiosTPM`, apply it to that new one as well,
and add/update the downstream bug reference. This got fixed in Ubuntu
(https://launchpad.net/bugs/1968187) but not in Debian yet.

Skip that test for RHEL 8 and Arch, where adding a TPM is known to not
work (same as in `testConfigureBeforeInstall()`).

Co-Authored-By: Martin Pitt <[email protected]>
This also checks TPM, so use the same ignore as in
testConfigureBeforeInstallBiosTPM.

This is a race condition/flake.
We want to write a precise naughty for some startup failures like
cockpit-project/bots#6792 . For that we need
the message on the console, not just on the UI.

It also makes human debugging easier.
Comment on lines +187 to +190
.catch(ex => onAddErrorNotification({
text: cockpit.format(_("Failed to add TPM to VM $0"), vm.name),
detail: ex.message,
resourceId: vm.id,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These 4 added lines are not executed by any test.

else
return Promise.reject(ex);
});
return cockpit.spawn(args, { err: "message", superuser: connectionName === "system" ? "try" : null });
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This added line is not executed by any test.

@martinpitt martinpitt requested a review from jelly November 4, 2024 16:16
@martinpitt
Copy link
Member

@jelly: @mvollmer previously approved this, but it needed some more cleanup. Perhaps you can take over? (I'll also still wait for Garrett's sign-off)

Copy link
Member

@garrett garrett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, this looks reasonable to me, if I understand the following correctly:

  1. when creating a VM, Cockpit Machines automatically adds a TPM if it either needs or can benefit from a TPM
  2. when there's an existing machine, it has the ability to add a TPM in the menu, as shown in the screenshot
  3. when a TPM is added when the machine is not running, it adds it and everything's good, and it'll be there on next boot with no UI about TPMs or anything
  4. when a TPM is added when the machine is running, it has the info bubble that the machine needs to be shut down when, and then the TPM will be there
  5. when there's no TPM to add (as when a TPM already exists or when it doesn't make sense, like a VM in BIOS), there's no UI whatsoever (including: make sure there isn't a border above/below where the TPM entry would)

As long as everything on the list above is met and the screenshots reflect the current state, then I approve.

Thanks!

@martinpitt
Copy link
Member

@garrett Confirmed, all of these are true. There's not much UI impact here indeed (this was mostly wrestling with technicalities)

Copy link
Member

@jelly jelly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@@ -33,6 +33,11 @@ def hasMonolithicDaemon(image):
image in ["arch"])


# software TPM doesn't work on some OSes
def hasBrokenTPM(image):
return image.startswith("rhel-8") or image == 'arch'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The arch image needs swtpm should be easily fixable.

@jelly jelly merged commit 998da2a into cockpit-project:main Nov 5, 2024
30 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants