VMware vSAN
September 2, 2022 — 19:48

Author: silver  Category: storage virtualization  Comments: Off

vSAN or "Virtual SAN" is an SDS or hyper-converged infrastructure solution from VMware. It’s an alternative to FC or iSCSI based shared storage. It uses a distributed file system called vDFS on top of object storage. Management is done with vCenter. For details there are the VMware docs, and of course many blogs have been written over the years. What follows below are a few things to take into consideration from an architecture and operations PoV.

Use case

Small to mid-sized environments (single to low double-digit host counts) with a shared storage requirement and no storage expertise available, where you want to make full use of the ESXi hosts’ hardware.

Design considerations

vSAN is not just "storage" but a clustered solution which fully integrates with vSphere.

While COTS hardware can be used, all devices have to be certified for and compatible with vSAN per VMware’s requirements. Storage will usually be (a combination of) SSD and flash-based devices, optionally with a dedicated cache device.

Network-wise there are no special requirements, besides the usual storage-related considerations about latency and bandwidth. Since v7U2 it is possible to use vSAN over RDMA ("vSANoRDMA") instead of TCP, offering lower latency and higher performance. Besides compatible NICs, RDMA (RoCE v2) requires a network configured for lossless traffic, which most recent network devices should support.
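To quickly check whether a host’s NICs expose RDMA devices at all, something like this can be run per host (ESXi 6.5 or later; output depends on NIC and driver):

esxcli rdma device list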

You will need a minimum of 3 hosts. The default setup is a single-site cluster. A stretched cluster configuration is also possible, with 2 sites replicating each other’s data. As it’s a cluster, consider failure domains, split-brain/node isolation scenarios, quorum, the witness, and FTT (Failures to Tolerate); e.g. FTT=1 with mirroring requires 2n+1 = 3 hosts, which is where the 3-host minimum comes from.

Data can be encrypted both in transit and at rest.

Caveats

After vSAN is enabled, HA works differently: vSphere HA sends its heartbeats over the vSAN network instead of the usual management network.

Although VMware tells you vSAN will work fine without vCenter (which is technically true), you should be prepared to fix issues while vCenter is unreachable. There are catch-22 situations where vCenter has no storage because vSAN is unavailable, but you need vCenter to fix it. That could mean having to (temporarily) set up a new vCenter or use the CLI.
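A few esxcli commands that help to inspect vSAN state directly on a host while vCenter is down (read-only; the health namespace needs a reasonably recent build):

esxcli vsan cluster get          # membership, master/backup role, member UUIDs
esxcli vsan network list         # vmkernel interfaces carrying vSAN traffic
esxcli vsan storage list         # local disks claimed by vSAN
esxcli vsan health cluster list  # health check summary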

While it’s possible to store files directly on a vSAN-backed datastore, you should instead set up "File Services", which offers SMB and NFS shares. It does this by placing an agent VM on each node.

Note that with RDMA, neither LACP nor IP-hash-based NIC teaming is supported.

While "selective" data replication using SPBM (storage policies) is possible, this can quickly get complicated (e.g. having to micro-manage storage per VM).

Since v7U2, data-at-rest encryption can be set up without an external KMS, by using vSphere as a Native Key Provider (NKP). Besides having a sane key management policy, this requires UEFI Secure Boot and TPM 2.0 to be enabled on the hosts first.

Before you can enter maintenance mode on a host, vSAN might need to migrate data to the remaining hosts in the cluster as an extra step.
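From the CLI the data handling can be chosen explicitly when entering maintenance mode; a sketch (the vsanmode values are ensureObjectAccessibility, evacuateAllData and noAction):

esxcli system maintenanceMode set --enable true --vsanmode ensureObjectAccessibility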

Data integrity has been excellent for me, but accessibility? Maybe not so much. I tested losing network connectivity, removing hosts from the cluster, and other ways of breaking the cluster, such as changing the vSAN configuration and removing/redeploying vCenter. I was always able to get full access back to the vSAN datastore without any data corruption. However, rebuilding the cluster with esxcli took a considerable amount of time, which meant downtime because VMs had no storage.

mergerfs
September 6, 2019 — 18:36

Author: silver  Category: linux storage  Comments: Off

A union filesystem (FUSE based), like unionfs, aufs and mhddfs. It merges multiple paths and mounts them as one, similar to concatenation.

Get it here: https://github.com/trapexit/mergerfs or from OS package repository.

Compared to the (older) alternatives, mergerfs has been very stable over the months I’ve been using it. It offers multiple policies for how data is spread over the underlying drives.

Optionally SnapRAID can be used to add parity disk(s) to protect against disk failures (https://www.snapraid.it).
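A minimal snapraid.conf sketch, with one drive given up for parity (paths and disk names are just examples and should match your own mounts):

parity /mnt/sdf1/snapraid.parity
content /var/snapraid/content
content /mnt/sdb1/snapraid.content
disk d1 /mnt/sdb1/
disk d2 /mnt/sdc1/
disk d3 /mnt/sdd1/
disk d4 /mnt/sde1/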

Create/mount pool

Example using 5 devices, /dev/sd[b-f]. The disks are already partitioned and have a filesystem.

for i in {b..f}; do
  mkdir /mnt/sd${i}1
  mount /dev/sd${i}1 /mnt/sd${i}1 && \
  mkdir /mnt/sd${i}1/mfs
done && \
mkdir /mnt/mergerfs && \
mergerfs -o defaults,allow_other,use_ino /mnt/sd*/mfs /mnt/mergerfs

And here’s the result from ‘df’:

/dev/mapper/sdb1             3.6T  100M  3.5T  1% /mnt/sdb1
/dev/mapper/sdc1             3.6T  100M  3.5T  1% /mnt/sdc1
/dev/mapper/sdd1             3.6T  100M  3.5T  1% /mnt/sdd1
/dev/mapper/sde1             3.6T  100M  3.5T  1% /mnt/sde1
/dev/mapper/sdf1             3.6T  100M  3.5T  1% /mnt/sdf1
mergerfs                      18T  500M  8.5T  1% /mnt/mergerfs
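To make the pool persist across reboots, the same mount can go into /etc/fstab; mergerfs expands the glob itself there (assuming the same mount points as above):

/mnt/sd*/mfs  /mnt/mergerfs  fuse.mergerfs  defaults,allow_other,use_ino  0 0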

Changing pool

Remove an old drive from the mergerfs pool:

xattr -w user.mergerfs.srcmounts -/mnt/data1 /mnt/pool/.mergerfs

Add a new drive:

xattr -w user.mergerfs.srcmounts +/mnt/data4 /mnt/pool/.mergerfs
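If the xattr tool isn’t available, setfattr (from the attr package) does the same thing:

setfattr -n user.mergerfs.srcmounts -v '-/mnt/data1' /mnt/pool/.mergerfs
setfattr -n user.mergerfs.srcmounts -v '+/mnt/data4' /mnt/pool/.mergerfs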

Some other mount options (-o):

  • use_ino: make mergerfs supply inodes
  • fsname=example-name: name shown in df
  • no_splice_write: fixes page errors in syslog

https://github.com/trapexit/mergerfs#mount-options

Pool info

xattr -l /mnt/mergerfs/.mergerfs

Tools

https://github.com/trapexit/mergerfs-tools

  • mergerfs.balance
  • mergerfs.consolidate
  • mergerfs.ctl
  • mergerfs.dedup
  • mergerfs.dup
  • mergerfs.fsck
  • mergerfs.mktrash

mergerfs.ctl

mergerfs.ctl -m /mnt/mergerfs info
mergerfs.ctl -m /mnt/mergerfs list values
mergerfs.ctl -m /mnt/mergerfs remove path /mnt/data1
mergerfs.ctl -m /mnt/mergerfs add path /mnt/data4

ZFS
March 21, 2015 — 15:40

Author: silver  Category: storage  Comments: Off

New zpool:

zpool create data /dev/aacd0p1.eli
zpool add data cache ada1p2
zpool add data log ada1p1
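Verify the layout afterwards; the cache and log devices show up as separate vdevs:

zpool status data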

Tuning (bsd):

/boot/loader.conf:

zfs_load="YES"

1G:


vm.kmem_size_max="1073741824"
vm.kmem_size="1073741824"

330M:


vm.kmem_size="330M"
vm.kmem_size_max="330M"


vfs.zfs.arc_max="40M"
vfs.zfs.vdev.cache.size="5M"

Send/receive using SSH:
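Roughly like this, with placeholder dataset, snapshot and host names:

zfs snapshot data/vm@snap1
zfs send data/vm@snap1 | ssh backuphost zfs receive -F tank/vm

# incremental, after taking a newer snapshot
zfs snapshot data/vm@snap2
zfs send -i data/vm@snap1 data/vm@snap2 | ssh backuphost zfs receive tank/vm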

LVM
July 9, 2014 — 15:09

Author: silver  Category: linux storage  Comments: Off

Resize:

vgextend vg_name /dev/sdb1
lvcreate -l 100%FREE -n lv_pstorage VolGroup
lvresize --size -8G /dev/VolGroup/lv_root
lvresize --size -35G /dev/VolGroup/lv_vz
lvresize --size -5G /dev/VolGroup/lv_pstorage
lvresize --size +5G /dev/VolGroup/lv_vz
lvextend -l +100%FREE /dev/centos/data

(after extending, grow the filesystem with resize2fs)
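Using the last example above, that would be:

resize2fs /dev/centos/data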

Rescue:

Boot your rescue media.
Scan for volume groups:

# lvm vgscan -v

Activate all volume groups:

# lvm vgchange -a y

List logical volumes:

# lvm lvs --all

With this information, and the volumes activated, you should be able to mount the volumes:

# mount /dev/volumegroup/logicalvolume /mountpoint






