I have a fresh pool that has just finished restoring from a backup, and I'm observing behavior that is not consistent with my understanding of how ZFS works. The pool consists of 4× HDDs in a 4-wide raidz1, plus 2× SSDs in a mirror as the special vdev. The pool was created with these non-default settings:

```
# zpool get all htank | awk '$2 !~ /feature@/ && $4 !~ /-|default/ { print }'
NAME   PROPERTY  VALUE  SOURCE
htank  ashift    12     local

# zfs get -s local all htank
NAME   PROPERTY              VALUE       SOURCE
htank  recordsize            1M          local
htank  mountpoint            /mnt/htank  local
htank  checksum              sha256      local
htank  compression           zstd-11     local
htank  atime                 off         local
htank  xattr                 on          local
htank  dnodesize             auto        local
htank  acltype               posix       local
htank  relatime              off         local
htank  special_small_blocks  128K        local
```

According to `zdb`:

```
# zdb -Lbbbs htank
<...>
 33.4M   30.1T   27.3T   36.4T   1.09M    1.11  100.00  Total
 1.48M    154G   6.82G   14.1G   9.54K   22.65    0.04  Metadata Total

Block Size Histogram

  block   psize                  lsize                  asize
   size   Count   Size   Cum.    Count   Size   Cum.    Count   Size   Cum.
    512:  1.41M   722M   722M    1.41M   722M   722M        0      0      0
     1K:   107K   113M   835M     107K   113M   835M        0      0      0
     2K:  42.3K   117M   952M    42.3K   117M   952M        0      0      0
     4K:  1.46M  5.85G  6.78G     118K   529M  1.45G    1.60M  6.40G  6.40G
     8K:   182K  1.56G  8.34G    46.6K   517M  1.95G    1.48M  12.2G  18.6G
    16K:  85.0K  1.85G  10.2G     279K  4.82G  6.77G     188K  3.37G  22.0G
    32K:   119K  5.33G  15.5G    66.1K  3.08G  9.85G     117K  5.19G  27.2G
    64K:   163K  14.8G  30.3G    64.6K  5.58G  15.4G     180K  16.1G  43.3G
   128K:   317K  56.3G  86.6G    1.21M   156G   172G     199K  41.3G  84.6G
   256K:  1.90M   852G   939G    35.2K  12.5G   184G     391K   148G   233G
   512K:  4.19M  2.98T  3.89T    27.7K  19.3G   204G    3.82M  2.84T  3.07T
     1M:  23.4M  23.4T  27.3T    29.9M  29.9T  30.1T    25.4M  33.4T  36.4T
     2M:      0      0  27.3T        0      0  30.1T        0      0  36.4T
     4M:      0      0  27.3T        0      0  30.1T        0      0  36.4T
     8M:      0      0  27.3T        0      0  30.1T        0      0  36.4T
    16M:      0      0  27.3T        0      0  30.1T        0      0  36.4T
```

However:

```
# zpool list -v htank
NAME                  SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
htank                58.5T  33.2T  25.3T        -         -     1%    56%  1.00x  ONLINE  -
  raidz1-0           58.2T  33.2T  25.1T        -         -     1%  57.0%      -  ONLINE
    htank-1          14.6T      -      -        -         -      -      -      -  ONLINE
    htank-2          14.6T      -      -        -         -      -      -      -  ONLINE
    htank-3          14.6T      -      -        -         -      -      -      -  ONLINE
    htank-4          14.6T      -      -        -         -      -      -      -  ONLINE
special                  -      -      -        -         -      -      -      -  -
  mirror-2            254G  41.6G   212G        -         -     7%  16.4%      -  ONLINE
    htank-special-1   256G      -      -        -         -      -      -      -  ONLINE
    htank-special-2   256G      -      -        -         -      -      -      -  ONLINE
logs                     -      -      -        -         -      -      -      -  -
  htank-log-1        7.98G    32K  7.50G        -         -     0%  0.00%      -  ONLINE
cache                    -      -      -        -         -      -      -      -  -
  htank-cache-1       128G  2.55G   125G        -         -     0%  1.99%      -  ONLINE
```

According to my understanding, this configuration should have resulted in at least 100G of special vdev usage, because all 87G of blocks whose PSIZE is 128K or less should have ended up on the special vdev. Is either of these reports wrong, or am I misunderstanding the mechanics of this feature?
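
To make my expectation explicit, this is roughly the per-block decision I assumed ZFS makes (a rough sketch only, not the actual OpenZFS allocation-class code; the names here are made up for illustration):

```c
/*
 * Sketch of my mental model of special_small_blocks routing -- not the
 * real OpenZFS code, just an illustration with made-up names.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MY_SPECIAL_SMALL_BLOCKS	(128ULL * 1024)	/* 128K, as set on htank */

/* Would this block be placed on the special (SSD mirror) class? */
static bool
prefers_special(uint64_t psize, bool is_metadata)
{
	if (is_metadata)
		return (true);	/* metadata goes to the special vdev */
	/* Data qualifies only when its physical (on-disk) size is <= 128K. */
	return (psize <= MY_SPECIAL_SMALL_BLOCKS);
}

int
main(void)
{
	/* 1: exactly at the limit qualifies; 0: anything above it does not. */
	printf("128K data block -> special? %d\n", prefers_special(128 * 1024, false));
	printf("129K data block -> special? %d\n", prefers_special(129 * 1024, false));
	return (0);
}
```

(As far as I know, the real code can also fall back to the normal class when the special vdev is nearly full, but with the mirror only 16% used that shouldn't matter here.)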
As I can see from `int bin = highbit64(BP_GET_PSIZE(bp)) - 1`, the histograms in `zdb` round block sizes down to the nearest power of two, so the 128K bin actually includes blocks with `128K <= size < 256K`. At the same time, `special_small_blocks=128K` really means `size <= 128K`. Maybe the rounding in `zdb` could benefit from a closer look.
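
For illustration, here is a self-contained stand-in for that binning (the `highbit64()` below is a naive reimplementation for the example, not the ZFS one):

```c
/* Demonstrates the power-of-two binning quoted above: blocks of 128K,
 * 160K and 255K all land in the same "128K" histogram row, yet only
 * the first one satisfies special_small_blocks=128K (size <= 128K). */
#include <stdint.h>
#include <stdio.h>

/* Naive stand-in for ZFS's highbit64(): 1-based index of the highest set bit. */
static int
highbit64(uint64_t x)
{
	int h = 0;
	while (x != 0) {
		h++;
		x >>= 1;
	}
	return (h);
}

int
main(void)
{
	uint64_t psize[] = { 128 * 1024, 160 * 1024, 255 * 1024 };

	for (int i = 0; i < 3; i++) {
		int bin = highbit64(psize[i]) - 1;	/* 17 for all three */
		printf("psize = %3lluK -> %lluK histogram row\n",
		    (unsigned long long)(psize[i] / 1024),
		    (unsigned long long)((1ULL << bin) / 1024));
	}
	return (0);
}
```

So only the rows up to and including 64K (30.3G cumulative psize) are guaranteed to fall at or below the 128K cutoff, which would bring the expectation much closer to the 41.6G actually allocated on the special mirror once metadata is counted.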