Cannot remove log device from encrypted pool #8748
-
System information

Describe the problem you're observing

I initially created my pool on macOS using O3X 1.8.2. The system was an older Mac mini with a 500 GB SSD, on which I reserved two partitions for a log and a cache device. The mini started to have issues running the pool, possibly due to being rather underpowered (4 GB RAM, 1.4 GHz i5), so I decided to move it to a beefier MacBook Pro. I exported on the mini and imported on the Pro, where of course the log and cache partitions were not available. I've now moved the pool to Ubuntu Server, as macOS just has so many issues running it stably and performantly (which may not be due to ZFS itself, but possibly to the USB subsystem, as this is a USB-connected pool). When I tried to remove the orphaned log device, I got:
Thing is, the encrypted datasets were mounted and happy (actually just the one, "titanic" itself (I haven't yet created any other filesystems)). Later, I was at the point where I was ready to add back a new log and cache device on the SSD on the Ubuntu Server rig. So I thought, given that I couldn't remove the missing log device, I'd try replacing it first. So, I created a partition on my Air's SSD, and did:
This was a bit weird, I thought (adding a log device to an ashift=12 pool seriously adds it as ashift=9 by default? Isn't that configuration pool-specific? I wasn't even aware that you could mix and match?), but adding

The next thing that happened was the pool going into a resilver (why resilver to replace a log device?) at crazy speeds (looks related to this bug):
EDIT NOTE: There are a couple of lines missing here at the end. I copied this from an O3X forum post I made before this one, and I have apparently made a bad copy. The actual output did include the original log device's GUID, which was

This was yesterday, and it never really finished or actually resilvered anything. Today, I wanted to replace the device that is listing 17.1K checksum errors above, but after doing

About the actual issue! :D I had to export the pool, remove the partition from the SSD for the new log device that was being replaced in, and re-import it to get it to stop trying to "resilver it into the pool." Again, it didn't seem to actually be doing anything, but it was stuck in the resilvering loop. The current status now is this:
This resilvering is for the device I'm actually currently replacing, so just disregard that. It's running along nicely, and it's not the fact that the pool is resilvering right now that is causing me to not be able to remove the log device, because I've tried this for a long time now, with the pool being in a normal state. My attempts at removing the log device now:
Again, the pool is mounted, and writeable.
I read a similar post online where the issue had to do with the log device being a mirror. This is not my case. The log device, when it was there, was just

Describe how to reproduce the problem

Not too sure. I don't have any more drives available to do something like this. But I hope the above description helps. I initially created the pool (on macOS with O3X 1.8.2) with

Include any warning/errors/backtraces from the system logs

I'm not sure exactly what to provide. Please let me know if I can provide anything. Thank you! :D
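For anyone trying to reproduce a similar layout, a pool with separate log and cache vdevs is typically created along these lines. This is a sketch only — the pool and device names are hypothetical placeholders, not the reporter's actual command, and the `zpool` invocation is echoed rather than executed since it needs real disks:

```shell
# Hypothetical names; substitute your own pool name and devices.
POOL=titanic
DATA=/dev/disk2
LOG=/dev/disk0s3     # small SSD partition intended as the SLOG
CACHE=/dev/disk0s4   # SSD partition intended as L2ARC

# Echoed rather than run, since it requires real devices and root:
echo "zpool create -o ashift=12 $POOL $DATA log $LOG cache $CACHE"
```

The `-o ashift=12` at creation time only pins the sector-size setting for the vdevs added in that command, which becomes relevant later in this thread.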
Replies: 20 comments
-
@DanielSmedegaardBuus thanks for the clear description of the issue. Regarding
-
Thank you, @behlendorf, for picking up my report :) Good to know about the ashift caveat! I'll make sure to specify at least ashift=12 when adding new log and cache devices. As for volumes, no. These are my current resources:
Could the snapshots perhaps play a part here? I'll destroy them and try again. (BTW: /scratch is a single-disk zpool for temp stuff)
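The ashift caveat discussed above follows from ashift being a per-vdev property: when you `zpool add` a log or cache device, it does not inherit the data vdevs' setting, so it should be specified explicitly. A sketch with hypothetical pool and partition names, echoed rather than executed since it needs a live pool:

```shell
POOL=titanic          # hypothetical pool name
LOGPART=/dev/sda3     # hypothetical SSD partitions
CACHEPART=/dev/sda4

# ashift applies per vdev, so pass it explicitly on add; otherwise the
# device's reported sector size decides (often ashift=9 on older SSDs):
echo "zpool add -o ashift=12 $POOL log $LOGPART"
echo "zpool add -o ashift=12 $POOL cache $CACHEPART"
```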
-
Okay, so I destroyed all snapshots and tried again, but still no donut:
Here's a zdb output in case it reveals anything:
BTW: those dashes in the device names are from me redacting serial numbers. Not sure if that's just paranoia; it seems like most people do that, so I'm following suit :)
-
I have hit this also. I tried to remove a SLOG from a pool and was told to supply the key for every encrypted dataset in order to replay the log. If this happens often, it basically breaks the promise of sending encrypted datasets to untrusted machines and not having to supply the key to do basic pool maintenance. Hopefully it can only happen if the dataset was actually mounted prior to a shutdown? It's quite possible there was a slightly unclean shutdown in my case: due to another outstanding issue, all pools need to be manually imported on startup with the latest stable Debian and 0.8.0 compiled from source. I'm sorry I don't remember the issue number, but it was discussed at length; there was a workaround involving creation of a symlink, but I seem to have lost it recently in going to 0.8.0.
-
@DanielSmedegaardBuus Hmm. This is interesting. Can you post a couple bits of debugging output? In particular, I would be interested in the output of:
and
Hopefully that should give us enough info to figure out what's going on here.
-
As I stated in #9576, I have the same problem on an unencrypted pool:
-
So it looks like the code that's causing your issue has actually been around for a long time. The command-line utility, however, is misinterpreting that EBUSY error code and turning it into a message about encryption. We should fix the error message to be more clear, but the actual offending line of code is:
@behlendorf Who would know about this bit of code? Does he just need to do a
-
@tcaputi So the line says that ZIL_REPLAY_NEEDED...
What happens if I don't replay the log? E.g., I've lost the hard drive with the log device. All datasets are mounted and in use. Does that mean they are OK? At least in the sense that data errors will not cause further data corruption? Like, there might have been files corrupted or missing, but I will not lose more data when using this filesystem, storing and modifying data on it. Or do I? And how can I remove the device that I cannot replay because I've lost it?
-
There is a module option which allows us to discard the zfs intent log instead of replaying it. You could try setting
That said, I completely agree that there really should be a
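The module option referred to above is presumably `zil_replay_disable` (it is named explicitly by a later participant in this thread). Toggling it looks roughly like this — a sketch only, with a hypothetical pool name; the commands are echoed rather than executed, since writing the knob requires root and a loaded zfs module:

```shell
PARAM=/sys/module/zfs/parameters/zil_replay_disable

# 1 = discard the intent log on import instead of replaying it.
# Echoed rather than run; these require root and a real pool.
echo "echo 1 > $PARAM"
echo "zpool export tank && zpool import tank"   # hypothetical pool name
echo "echo 0 > $PARAM"                          # restore the default
```

Note that discarding the ZIL drops the last few seconds of synchronous writes, as discussed below, so this is a last resort rather than routine maintenance.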
-
So can I be sure that discarding the ZIL will not cause any further data corruption? And what about replaying the ZIL on a pool/datasets that have been mounted and used since the pool was degraded, is that safe?
-
The ZIL for each filesystem only contains data that was written in the last few seconds before the system crashed or the filesystem was unmounted. You should not see any corruption from removing a ZIL, although you may lose that last little bit of data. Replaying the ZIL should not cause any further corruption either; to be honest, I'm not 100% sure what will happen in that case, but I'm about 99% sure the ZIL will just be discarded.
-
I think it's important to define "any corruption" when used in a sentence like this (since naive users may read this and get the wrong idea). For example, the application that wrote the data contained in the ZIL may detect "corruption", but ZFS itself will not detect any "corruption".
-
I've tried doing
-
So I rebooted with

There's still obviously some problem.
-
With OpenZFS 2.0 we've fixed a few issues in this area and updated the documentation to mention that the encryption keys need to be loaded in order to remove a log device from the pool.
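In practice, on 2.0+ the supported sequence is to load every encryption key first (so the intent log can be replayed) and then remove the log vdev. A sketch with hypothetical pool/device names, echoed rather than executed since it requires a real pool and interactive passphrase entry:

```shell
POOL=tank            # hypothetical pool name
LOGDEV=nvme0n1p2     # hypothetical log vdev name as shown by zpool status

# Echoed rather than run; load-key prompts for each dataset's passphrase.
echo "zfs load-key -a"
echo "zpool remove $POOL $LOGDEV"
echo "zpool status $POOL"    # verify the log vdev is gone
```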
-
I faced this problem as well, but I hadn't even encrypted my drive at all, so I'm looking for similar answers here. In case anyone wonders, here are my zpool status and remove command results:
-
Oh, and I see that when I started doing write operations to rpool, the entire rpool froze and made no progress. I later spent an entire evening transferring my ZFS root from rpool to rpool2 (remember, reading is fine but writing is not) and started it there. Just now I see that zed is stuck in a "zombie"-like situation. zed occupied an entire core's timeslice (zed should indeed be single-threaded) and I tried to kill it by any means. Not even SIGKILL did anything. All ZFS-related operations are stuck (including zpool and zfs), and the kernel also prints I/O-hung messages in dmesg:
While zed is technically a mutant "zombie", can it really be considered one? It is a core part of ZFS, being the event-processing daemon, so killing it might mean certain death, and thus there must be some kind of kernel-level magic preventing this process from being killed in the first place... thus:
As you can see, it timed out killing it, though I can understand why. Now back to the discussion. I suspect the reason my RAIDZ1 pool isn't progressing is the faulty SLOG device: when I tried to remove it, it told me something unexpected, namely that my pool had an encrypted dataset in it, while in reality it doesn't. So this SLOG may have some degree of corruption, to the point that it is still considered a valid SLOG device, but the content it should replay after a catastrophic event (such as a forced, unclean reset) isn't. It tried to replay the "encrypted section" of the data, but it refuses, and thus the dataset is running in circles trying to replay it, since it is important for a SLOG to replay sequentially for consistency.

I'll confess that I did a series of reboots before, because I was trying to debug a weird issue where my SCSI drives under a SAS controller keep failing. I had lost one member of the pool not long ago, so there are 3 remaining, and it looked quite dangerous to me; I saw this as a sign of something sinister about to happen. I quickly reused my other RAID1/mirror pool (rpool2) and created a big enough ZVOL inside rpool2 to replace it as a member of rpool. These two drives were originally my 12TB NAS drives that also happened to run ZFS, so it's a smooth criminal to get into the game.

In summary: I re-imported an existing ZFS pool and made a ZVOL inside the imported pool the replacement RAID member for another, dying pool. The problem is that I know ZFS nesting is a bad thing; I remember someone from ServeTheHome saying this can cause a ZFS (txg) deadlock, and I highly suspect I'm witnessing one such case. Terribly fascinating, at the cost of my data at knife point. I would never do this again, but I just hope that in the short term, after I "salvage" all my data (well, at this point a 3-member pool out of a RAIDZ1 is technically functional thanks to XOR/ECC magic, but this is SHTF anyway), it will be fine.
It was never fine at any point.
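A side note on the unkillable zed above: a process that ignores SIGKILL is usually not a zombie (state Z) but stuck in uninterruptible sleep (state D) inside the kernel, typically waiting on hung I/O; no signal helps until that I/O completes or fails. You can list such processes with standard procps, nothing ZFS-specific:

```shell
# List processes in uninterruptible sleep (state starting with D);
# these cannot be killed until their pending kernel I/O resolves.
ps -eo pid,stat,comm | awk 'NR == 1 || $2 ~ /^D/'
```

On a healthy system this usually prints only the header line.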
-
I am seeing the same problem. Ubuntu 20.04. I've never had an encrypted dataset on any pool. This pool did hold some device volumes, but I destroyed them all.

pool: p5

root@shed9:/services# zpool iostat -v p5
p5  6.51T  7.99T  94  556  732K  11.5M
root@shed9:/services# zpool remove p5 nvme0n1p2

Although, according to the title of this thread, it is about encryption, most of this thread is about an error with a misleading message. With due respect to behlendorf, I don't believe it has been answered. I need to remove the device holding the log but am prevented from doing so.
-
Same for me, realised it just today. The pool never had encryption; ashift=12 for all devices.
zfs is version (debian / buster-backports)
The dbgmsg output, with full flags and after the remove op, is full of:
all with the same timestamp from beginning to end. The pool otherwise claims to be HEALTHY, and the log device itself too.
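For anyone else capturing the same diagnostics: the internal debug log mentioned above lives in /proc/spl/kstat/zfs/dbgmsg and must be enabled via a module parameter first. A sketch with a hypothetical pool/device, echoed rather than executed since it requires root and a loaded zfs module:

```shell
DBG=/proc/spl/kstat/zfs/dbgmsg

# Echoed rather than run; all of these need root.
echo "echo 1 > /sys/module/zfs/parameters/zfs_dbgmsg_enable"
echo "echo 0 > $DBG"                  # writing 0 clears the buffer
echo "zpool remove tank sdb1"         # hypothetical pool/log vdev
echo "cat $DBG"                       # messages emitted by the attempt
```

Clearing the buffer immediately before the failing command keeps the capture limited to the remove attempt itself.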
-
I have the same issue. I tried the zil_replay_disable trick, and also upgraded from 0.8.3 (the stock one on Ubuntu 20.04) to 2.1.0 (from Jonathon F), but always got the same message: "Mount encrypted datasets to replay logs."
The problem doesn't seem related to the "UNAVAIL" status of the log device. |