Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Storage: Avoid filling volume config on VolumeDBCreate` #14923

Draft
wants to merge 20 commits into
base: main
Choose a base branch
from

Conversation

hamistao
Copy link
Contributor

@hamistao hamistao commented Feb 5, 2025

This is a pre-requisite for #14595.

The whole idea here is that we want to adapt the provided volume config on VolumeDBCreate to populate volatile.rootfs.size without changing the original config passed in, as this would be a very easy and consistent way to keep track of volume sizes.

But we are currently relying on VolumeDBCreate to populate some fields within the volume for future use, which is giving many unrelated purposes to this function. So this instead proposed filling this config on GetNewVolume instead, as it makes sense for a new volume to have default conifg for the driver filled in. That, along with a few more tweaks, allows us to use VolumeDBCreate exclusively to create the database entry for the volume, adapting the volume config as needed without compromising the original config passed in as an argument.

This erros is always nil, so there is no reason for is to keep this return type.

Signed-off-by: hamistao <[email protected]>
`VolumeDBCreate` has a weird functionality of filling the volume with default config for its pool. Since the config keys that are filled up config affect the original config passed in and are sometimes necessary for subsequent operations on the volume beyond creating the DB entry for the volume. So this is part of the plan to move this operation elsewhere.

Signed-off-by: hamistao <[email protected]>
The intention is to cover for almost all cases where we were filling volume configuration on `VolumeDBCreate`. Now this makes more sense as it is expected for a newly created volume to have the default configuration for its pool.

Signed-off-by: hamistao <[email protected]>
Since `VolumeDBCreate` does not fill the config anymore, fill it beforehand.

Signed-off-by: hamistao <[email protected]>
@hamistao hamistao force-pushed the move_volume_config_filling branch from 503db49 to 6e9d608 Compare February 5, 2025 03:47
@hamistao hamistao force-pushed the move_volume_config_filling branch from 8a73064 to 1497ea0 Compare February 5, 2025 04:50
@@ -250,11 +250,15 @@ func (b *lxdBackend) Create(clientType request.ClientType, op *operations.Operat
}

// GetNewVolume returns a drivers.Volume that doesn't yet exist in the database.
// It contains copies of the supplied volume config, including a new UUID, and the pools config.
// It contains copies of the supplied volume config, including a new UUID and
// default configuration for the poo driver, as well as the pool's config.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets rephase to:

// It contains copies of the supplied volume config and populates a new UUID and any pool level defaults.

@@ -915,9 +915,12 @@ func (b *lxdBackend) CreateInstanceFromBackup(srcBackup backup.Info, srcData io.

newSnapshotName := drivers.GetSnapshotVolumeName(inst.Name(), backupFileSnap)

// Ensure driver specific config is filled in new volume.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// Ensure driver specific default config is filled in new volume.

@@ -6447,8 +6447,7 @@ func (b *lxdBackend) ImportCustomVolume(projectName string, poolVol *backupConfi
for _, poolVolSnap := range poolVol.VolumeSnapshots {
fullSnapName := drivers.GetSnapshotVolumeName(poolVol.Volume.Name, poolVolSnap.Name)

// Copy volume config from backup file if present
// (so VolumeDBCreate can safely modify the copy if needed).
// Create new volume object to get a proper configuration for the driver in use.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// Ensure driver specific default config is filled in new volume.

@@ -7481,8 +7478,7 @@ func (b *lxdBackend) ImportInstance(inst instance.Instance, poolVol *backupConfi
for _, poolVolSnap := range poolVol.VolumeSnapshots {
fullSnapName := drivers.GetSnapshotVolumeName(inst.Name(), poolVolSnap.Name)

// Copy volume config from backup file if present,
// so VolumeDBCreate can safely modify the copy if needed.
// Create new volume object to get a proper configuration for the driver in use.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// Ensure driver specific default config is filled in new volume.

@@ -7503,8 +7499,7 @@ func (b *lxdBackend) ImportInstance(inst instance.Instance, poolVol *backupConfi
for _, i := range snapshots {
fullSnapName := i // Local var for revert.

// Copy the parent volume's config,
// so VolumeDBCreate can safely modify the copy if needed.
// Create new volume object to get a proper configuration for the driver in use.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// Ensure driver specific default config is filled in new volume.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrm I'd be hesitant to start forcefully killing ongoing storage commands that are mutating the local or remote storage layers during a clean shutdown. This feels like its asking for trouble to me.

Was there a reason you didn't go with context.Background() here to ensure the commands safely completed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just failed to consider the risks of stopping these commands midway. Considering that doing so would simply leave the storage layers unharmed is far too optimistic.

@@ -289,7 +289,7 @@ func (d *common) moveGPTAltHeader(devPath string) error {
return nil
}

_, err = shared.RunCommand(path, "--move-second-header", devPath)
_, err = shared.RunCommandContext(d.state.ShutdownCtx, path, "--move-second-header", devPath)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i dont think we should use d.state.ShutdownCtx for anything mutating the disk, as Go will then forcefully kill the command, leaving it in an inconsistent state.

We've been bitten before by this in the past where we killed an ongoing ceph map command, which then left unrecoverable structures in the kernel and the system had to be rebooted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've been bitten before by this in the past where we killed an ongoing ceph map command, which then left unrecoverable structures in the kernel and the system had to be rebooted.

Always useful to know some history, thanks for the insight!
I will make the necessary updates.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general if a command isnt mutating something we can use d.state.shutdownCtx to exit more quickly, but if we're mutating something it'd be better to let it finish in my view (ofc if thats going to take a long time and some cleanup is going to happen on error, such as writing a large file out and then removing it on error in a defer then that would be different).

Copy link
Member

@tomponline tomponline left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets avoid introducing potential new issues by forcefully canceling commands that are mutating on disk structures during a clean shutdown. Use context.Background() for these until such time as you're prepared to test each one for its resulting behaviour.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants