ZFS Commands: Difference between revisions
Revision as of 21:54, 14 February 2026
Introduction
Many commands show and control how the ZFS pools run on the Proxmox hosts, so this page gives a brief guide to ZFS.
What is ZFS
To understand ZFS at a technical level, you must treat it as a combined file system and logical volume manager. This integration allows the file system to be aware of the underlying disk structure, which is the core of its data integrity features.
Physical Disks (vdev members)
The base layer consists of block devices (e.g., /dev/nvme0n1). ZFS uses Copy-on-Write (CoW) at this level. When data is modified, ZFS writes the new data to a new block and then updates the pointers, rather than overwriting the old data. This prevents data corruption during power loss.
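The Copy-on-Write idea can be illustrated with a toy model. This is only a sketch in Python, not how ZFS is implemented: the class CowStore and its crash_before_commit flag are hypothetical names invented for the illustration. It shows that new data goes to a fresh block and only then does the pointer move, so an interrupted write leaves the old data readable.

```python
class CowStore:
    """Toy copy-on-write store: one live pointer, many immutable blocks."""

    def __init__(self, initial: bytes) -> None:
        self.blocks = {0: initial}   # block id -> data
        self.root = 0                # pointer to the live block
        self._next_id = 1

    def write(self, data: bytes, crash_before_commit: bool = False) -> None:
        # Step 1: write the new data to a brand-new block.
        new_id = self._next_id
        self._next_id += 1
        self.blocks[new_id] = data
        if crash_before_commit:
            # Simulated power loss: the pointer was never updated,
            # so the old block is still the live one.
            return
        # Step 2: update the pointer to the new block.
        self.root = new_id

    def read(self) -> bytes:
        return self.blocks[self.root]


store = CowStore(b"old contents")
store.write(b"new contents", crash_before_commit=True)
print(store.read())   # still the old data
store.write(b"new contents")
print(store.read())   # now the new data
```

The old block is never overwritten in place, which is the property the paragraph above relies on.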
A VDEV (Virtual Device) is the mathematical arrangement of physical disks.
- Mirror: Data is written to n disks. Read IOPS scale with the number of disks; write IOPS match a single disk.
- RAID-Z1: Distributes data and single parity across n+1 disks, providing n disks of capacity. For example, three 16 TB drives as Z1 give (3-1) x 16 = 32 TB usable capacity, and four 16 TB drives give (4-1) x 16 = 48 TB.
- VDEV Failure: If a VDEV loses its redundancy (e.g., two disks fail in a RAID-Z1), the entire ZPool is lost because ZFS stripes data across all VDEVs in a pool (RAID 0 across VDEVs).
- Special VDEVs: This is where things like L2ARC (cache) or a SLOG (a separate log device that holds the ZIL) live. They are "helper" pillars for the main structure but don't contribute to the storage capacity.
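The capacity rules for the two layouts above can be written as a small calculation. This is a rough sketch in Python; the function name usable_capacity_tb is made up for the example, and it ignores metadata overhead and reserved space, so real pools report slightly less.

```python
def usable_capacity_tb(layout: str, disks: int, disk_tb: float) -> float:
    """Nominal usable capacity of one VDEV, in the same unit as disk_tb."""
    if disks < 2:
        raise ValueError("a redundant VDEV needs at least two disks")
    if layout == "mirror":
        # Every disk holds the same data, so capacity is one disk.
        return disk_tb
    if layout == "raidz1":
        # One disk's worth of space is used for parity.
        return (disks - 1) * disk_tb
    raise ValueError(f"unsupported layout: {layout}")


print(usable_capacity_tb("raidz1", 3, 16))  # three 16 TB drives
print(usable_capacity_tb("raidz1", 4, 16))  # four 16 TB drives
print(usable_capacity_tb("mirror", 2, 16))  # two-way mirror
```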
ZPool (Storage Pool)
The ZPool is a logical aggregation of one or more VDEVs. All VDEVs in a pool share a single "Free Space" map. Adding a new VDEV to a pool increases the pool's capacity and performance (IOPS) immediately.
- You don't "size" a ZFS pool; it is simply the sum of its VDEVs.
- If you add a second Mirror VDEV to an existing Pool, the Pool grows instantly.
How this applies to the Pear array of three 16 TB hard drives
- Each of the drives will be shown as about 14.5 TB, because ZFS reports sizes in binary units.
- The usable capacity will be about 29 TB and can be shown with the command
zfs list pearpool
- The pool will be shown with a raw capacity of 43.7 TB, parity space included, with the command
zpool list pearpool
- To show the individual components of the vdev including the L2ARC cache we use the verbose switch
zpool list -v pearpool
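The figures above follow from converting decimal terabytes (as printed on the drive label) into the binary units that ZFS reports. A short Python sketch of that arithmetic, assuming exactly 16 x 10^12 bytes per drive and ignoring ZFS metadata and reserved space (the helper name tb_to_tib is made up for the example):

```python
TB = 10**12    # decimal terabyte, as used on drive labels
TIB = 2**40    # binary tebibyte, as reported by zfs and zpool


def tb_to_tib(tb: float) -> float:
    """Convert decimal terabytes to binary tebibytes."""
    return tb * TB / TIB


disk = tb_to_tib(16)
print(f"one drive          : {disk:.2f} TiB")      # 14.55
print(f"raw, three drives  : {3 * disk:.2f} TiB")  # 43.66
print(f"usable, two drives : {2 * disk:.2f} TiB")  # 29.10
```

The real zfs list value will be a little below the calculated usable figure because the pool keeps some space back for itself.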
Datasets and Zvols
In the ZFS world, everything sitting on top of a Pool is technically a "Dataset," but they diverge into two distinct types based on how they present storage to the operating system.
ZFS Datasets (The Filesystem Layer)
A "Dataset" (often specifically called a Filesystem Dataset) is a POSIX-compliant filesystem.
- How they behave: They act like high-powered folders. When you create one (e.g., zfs create pearpool/archive), it is instantly mounted as a directory in Linux.
- Property Inheritance: Datasets exist in a tree. If you set compression=lz4 on the parent pool, every dataset you create under it inherits that setting automatically.
- No Fixed Size: Unlike a partition, a dataset has no "size." It simply takes what it needs from the pool's free space. You use Quotas to stop it from eating the whole pool.
- Proxmox Use Case: Proxmox uses Datasets for LXC Containers. Because containers share the host's kernel, they can write directly to a ZFS filesystem. It also uses them for "Directory" storage (to hold ISOs or .qcow2 files).
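Property inheritance can be pictured as a walk up the dataset tree until a locally set value is found. The following Python sketch is a simplified model only, not ZFS code: the function effective_property and the settings dictionary are hypothetical, and it assumes the property is one that ZFS allows to be inherited (compression is; quota is not).

```python
def effective_property(local_settings, dataset, prop, default=None):
    """Return (value, source) for prop on dataset.

    local_settings maps a dataset name to the properties set
    directly on it. source is the dataset the value came from,
    or None if the default was used.
    """
    name = dataset
    while True:
        if prop in local_settings.get(name, {}):
            return local_settings[name][prop], name
        if "/" not in name:
            # Reached the pool's root dataset without finding a value.
            return default, None
        # Move one level up the tree, e.g. pearpool/archive -> pearpool.
        name = name.rsplit("/", 1)[0]


settings = {
    "pearpool": {"compression": "lz4"},
    "pearpool/archive/raw": {"compression": "off"},
}
print(effective_property(settings, "pearpool/archive", "compression"))
print(effective_property(settings, "pearpool/archive/raw", "compression"))
```

On a real system the equivalent check is zfs get compression on the dataset, whose SOURCE column shows whether the value is local, inherited or a default.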
ZVOLs (The Block Layer)
A ZVOL (ZFS Volume) is a Dataset that represents a Block Device.
- How they behave: Instead of appearing as a folder, a ZVOL appears as a raw disk at a path such as /dev/zvol/pearpool/vm-101-disk-0.
- Fixed Size: You must define a size when you create a ZVOL (e.g., zfs create -V 50G pearpool/win11-disk). The OS using it thinks it is a physical 50GB hard drive.
- Abstraction: A ZVOL allows you to run a different filesystem (like NTFS for Windows or EXT4 for Linux VMs) on top of ZFS. ZFS manages the blocks, but the VM manages its own files.
- Proxmox Use Case: This is the default for VMs (KVM). When you create a VM on ZFS storage, Proxmox creates a ZVOL. This allows the VM to treat the storage as a real SCSI/VirtIO hardware disk.
ZFS Health and Maintenance
ZFS is a "self-healing" system, but it requires the administrator to monitor its signals. Because ZFS manages its own redundancy, standard Linux disk tools (like df -h) often provide incomplete information.
Monitoring the "Pulse"
To see if your pool is healthy, you must check the Pool Status. This is your primary diagnostic tool.
- Command:
zpool status pearpool
What to look for:
- STATE: Should always be ONLINE. If it says DEGRADED, a disk has failed or is disconnected.
- READ/WRITE/CKSUM Errors: These numbers should be 0.
- CKSUM (Checksum) Errors: If this is non-zero, ZFS has detected "Silent Data Corruption" and fixed it using the parity/mirror data. It is a warning that a cable or a disk is starting to fail.
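The checks above can be automated by scanning the device table that zpool status prints. This is a minimal Python sketch, assuming the usual NAME STATE READ WRITE CKSUM column layout; the function name find_problem_devices is made up, and the parsing is deliberately simple, so treat it as a starting point rather than a monitoring tool.

```python
def find_problem_devices(status_text):
    """Return table rows that are not ONLINE or have non-zero error counters."""
    problems = []
    in_table = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:5] == ["NAME", "STATE", "READ", "WRITE", "CKSUM"]:
            in_table = True       # the device table starts after this header
            continue
        if fields and fields[0] == "errors:":
            in_table = False      # the table ends at the errors summary
        if not in_table or len(fields) < 5:
            continue              # skip blank lines and section labels
        name, state, read, write, cksum = fields[:5]
        if state != "ONLINE" or (read, write, cksum) != ("0", "0", "0"):
            problems.append({
                "name": name, "state": state,
                "read": read, "write": write, "cksum": cksum,
            })
    return problems
```

Pass it the text captured from zpool status pearpool; an empty list means every row in the table was ONLINE with zero READ, WRITE and CKSUM counters.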
Active Self-Healing
A "Scrub" is a deep-scan of the entire pool. ZFS reads every single block and compares it against its mathematical checksum. If it finds a mismatch, it repairs the data automatically from the redundant (mirror or parity) copy.
Command:
zpool scrub pearpool
When to do it:
- Proxmox usually schedules this monthly, but you should run it manually if you experience a hard power cut or suspect a disk is acting up.
- Performance: You can still use the server during a scrub, though disk latency may increase slightly.
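What a scrub does can be shown with a toy model of a two-way mirror. This Python sketch is an illustration only: the scrub and checksum functions are hypothetical, SHA-256 stands in for whatever checksum the pool actually uses, and real ZFS stores checksums in its block pointers rather than in a separate list.

```python
import hashlib


def checksum(data: bytes) -> str:
    # Stand-in checksum for the illustration.
    return hashlib.sha256(data).hexdigest()


def scrub(disks, checksums):
    """Verify every block on every disk; repair bad copies from a good one.

    disks[d][b] is block b as stored on disk d.
    Returns the number of block copies that were repaired.
    """
    repaired = 0
    for b, expected in enumerate(checksums):
        # Look for any copy of this block that still matches its checksum.
        good = next((d[b] for d in disks if checksum(d[b]) == expected), None)
        if good is None:
            continue  # no intact copy left: this block cannot be repaired
        for d in disks:
            if checksum(d[b]) != expected:
                d[b] = good   # overwrite the damaged copy
                repaired += 1
    return repaired


blocks = [b"alpha", b"beta"]
sums = [checksum(x) for x in blocks]
mirror = [list(blocks), list(blocks)]
mirror[1][0] = b"alphX"        # simulate silent corruption on one disk
print(scrub(mirror, sums))     # one copy repaired
print(mirror[1][0])            # back to the original data
```

The model also shows why redundancy matters: if every copy of a block fails its checksum, there is nothing to repair from.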