2023-05-08 07:55:33 +03:00
|
|
|
=============
|
|
|
|
zoned-storage
|
|
|
|
=============
|
|
|
|
|
|
|
|
Zoned Block Devices (ZBDs) divide the LBA space into block regions called zones
|
|
|
|
that are larger than the LBA size. They can only allow sequential writes, which
|
|
|
|
can reduce write amplification in SSDs, and potentially lead to higher
|
|
|
|
throughput and increased capacity. More details about ZBDs can be found at:
|
|
|
|
|
|
|
|
https://zonedstorage.io/docs/introduction/zoned-storage
|
|
|
|
|
|
|
|
1. Block layer APIs for zoned storage
|
|
|
|
-------------------------------------
|
|
|
|
QEMU block layer supports three zoned storage models:
|
|
|
|
- BLK_Z_HM: The host-managed zoned model only allows sequential writes access
|
|
|
|
to zones. It supports ZBD-specific I/O commands that can be used by a host to
|
|
|
|
manage the zones of a device.
|
|
|
|
- BLK_Z_HA: The host-aware zoned model allows random write operations in
|
|
|
|
zones, making it backward compatible with regular block devices.
|
|
|
|
- BLK_Z_NONE: The non-zoned model has no zones support. It includes both
|
|
|
|
regular and drive-managed ZBD devices. ZBD-specific I/O commands are not
|
|
|
|
supported.
|
|
|
|
|
|
|
|
The block device information resides inside BlockDriverState. QEMU uses
|
|
|
|
BlockLimits struct(BlockDriverState::bl) that is continuously accessed by the
|
|
|
|
block layer while processing I/O requests. A BlockBackend has a root pointer to
|
|
|
|
a BlockDriverState graph(for example, raw format on top of file-posix). The
|
|
|
|
zoned storage information can be propagated from the leaf BlockDriverState all
|
|
|
|
the way up to the BlockBackend. If the zoned storage model in file-posix is
|
|
|
|
set to BLK_Z_HM, then block drivers will declare support for zoned host device.
|
|
|
|
|
|
|
|
The block layer APIs support commands needed for zoned storage devices,
|
|
|
|
including report zones, four zone operations, and zone append.
|
|
|
|
|
|
|
|
2. Emulating zoned storage controllers
|
|
|
|
--------------------------------------
|
|
|
|
When the BlockBackend's BlockLimits model reports a zoned storage device, users
|
|
|
|
like the virtio-blk emulation or the qemu-io-cmds.c utility can use block layer
|
|
|
|
APIs for zoned storage emulation or testing.
|
|
|
|
|
|
|
|
For example, to test zone_report on a null_blk device using qemu-io is::
|
|
|
|
|
|
|
|
$ path/to/qemu-io --image-opts -n driver=host_device,filename=/dev/nullb0 -c "zrp offset nr_zones"
|
2023-05-08 08:19:16 +03:00
|
|
|
|
|
|
|
To expose the host's zoned block device through virtio-blk, the command line
|
|
|
|
can be (includes the -device parameter)::
|
|
|
|
|
|
|
|
-blockdev node-name=drive0,driver=host_device,filename=/dev/nullb0,cache.direct=on \
|
|
|
|
-device virtio-blk-pci,drive=drive0
|
|
|
|
|
|
|
|
Or only use the -drive parameter::
|
|
|
|
|
|
|
|
-driver driver=host_device,file=/dev/nullb0,if=virtio,cache.direct=on
|
|
|
|
|
|
|
|
Additionally, QEMU has several ways of supporting zoned storage, including:
|
|
|
|
(1) Using virtio-scsi: --device scsi-block allows for the passing through of
|
|
|
|
SCSI ZBC devices, enabling the attachment of ZBC or ZAC HDDs to QEMU.
|
|
|
|
(2) PCI device pass-through: While NVMe ZNS emulation is available for testing
|
|
|
|
purposes, it cannot yet pass through a zoned device from the host. To pass on
|
|
|
|
the NVMe ZNS device to the guest, use VFIO PCI pass the entire NVMe PCI adapter
|
|
|
|
through to the guest. Likewise, an HDD HBA can be passed on to QEMU all HDDs
|
|
|
|
attached to the HBA.
|