ZeroStack provides high availability at several layers: control plane, network, storage, VM, and multi-region.
Control Plane HA
ZeroStack’s HA model uses a controller-less architecture. There is no need to configure HA on separate controller hosts. The ZeroStack control plane is distributed and is designed to be self-healing and highly available from the ground up.
When any host in the system goes down, the control plane automatically migrates services to another host without any downtime to the workloads running on the cluster.
Storage HA
ZeroStack provides replicated SSD and HDD pools for data protection. These pools are built from local disks across multiple compute hosts, and data is replicated to protect against both host and disk failures. Objects are stored with a replica count of 3. After one host fails, the pool retains full storage replication capability and can survive one more host failure before reaching “degraded mode”. The 3x replica count likewise provides full protection against one disk failure, and the pool can survive a second disk failure without loss of data.
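The replica arithmetic can be made concrete with a small sketch. This is an illustrative worst-case model, not ZeroStack's implementation: it assumes every failed host (or disk) held one copy of the same object and that no re-replication completes between failures. The function names are hypothetical.

```python
REPLICA_COUNT = 3  # from the text: each object is stored three times


def surviving_replicas(failures: int, replica_count: int = REPLICA_COUNT) -> int:
    """Copies of an object left after `failures` hosts (or disks) are lost.

    Worst-case assumption: each failure removes one copy and nothing is
    rebuilt in between.
    """
    if failures < 0:
        raise ValueError("failures must be non-negative")
    return max(replica_count - failures, 0)


def data_survives(failures: int, replica_count: int = REPLICA_COUNT) -> bool:
    """The object is still readable while at least one copy remains."""
    return surviving_replicas(failures, replica_count) >= 1


if __name__ == "__main__":
    for failures in range(4):
        print(failures, surviving_replicas(failures), data_survives(failures))
```

Under these assumptions, an object still has a copy after two simultaneous failures and is lost only if all three copies fail before any rebuild.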
VM HA
ZeroStack also provides VM high availability, which keeps user workloads from becoming unavailable when hosts fail.
In the event of a host failure, ZeroStack automatically brings up each VM tagged for HA on another available host.
ZeroStack users can create highly available VMs by marking a checkbox in the UI. This keeps the VM available in the event of hardware or VM failure. VMs marked for HA must use one of the shared storage pools offered by ZeroStack.
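The failover behavior described above can be sketched as a small planning function. This is a hypothetical illustration, not ZeroStack's scheduler: the `VM` fields, the vCPU-based capacity check, and the "most spare capacity" rule are all assumptions. The shared-storage check shows why the requirement exists: a VM whose disks lived only on the failed host has nothing to restart from.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    host: str
    ha: bool              # the HA checkbox from the UI
    shared_storage: bool  # True if the VM's disks are on a shared pool
    vcpus: int


def plan_failover(failed_host, vms, free_vcpus):
    """Return {vm_name: target_host} for HA VMs that ran on failed_host.

    free_vcpus maps each host to its spare vCPU capacity (hypothetical metric).
    """
    # Only surviving hosts are candidates.
    capacity = {h: c for h, c in free_vcpus.items() if h != failed_host}
    plan = {}
    for vm in vms:
        if vm.host != failed_host or not vm.ha:
            continue  # unaffected, or not tagged for HA
        if not vm.shared_storage:
            continue  # disks were local to the failed host
        candidates = [h for h, c in capacity.items() if c >= vm.vcpus]
        if not candidates:
            continue  # no surviving host has room
        target = max(candidates, key=lambda h: capacity[h])
        capacity[target] -= vm.vcpus
        plan[vm.name] = target
    return plan
```

For example, with HA VM `web` and non-HA VM `batch` on a failed host `h1`, only `web` appears in the returned plan.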
Network HA
Each host has two 10GBase-T NICs, each connected to a different top-of-rack (ToR) switch in an active-active configuration using the Link Aggregation Control Protocol (LACP). The failure, reboot, or software upgrade of a single ToR switch results in temporary performance degradation instead of an outage.
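To illustrate why a single switch failure degrades performance rather than causing an outage, the sketch below computes a host's nominal aggregate uplink from the state of its two ToR switches. The function name is hypothetical, and the figures are nominal line rates (10 Gb/s per 10GBase-T link), not measured throughput.

```python
NIC_SPEED_GBPS = 10  # nominal line rate of one 10GBase-T NIC


def host_uplink_gbps(tor_up):
    """Nominal aggregate uplink for one host.

    tor_up holds one boolean per NIC: True if the ToR switch that NIC
    is connected to is currently up.
    """
    return NIC_SPEED_GBPS * sum(1 for up in tor_up if up)


if __name__ == "__main__":
    print(host_uplink_gbps([True, True]))    # both switches up
    print(host_uplink_gbps([True, False]))   # one switch down: half capacity
    print(host_uplink_gbps([False, False]))  # both down: no connectivity
```

Only the loss of both switches at once leaves the host without connectivity in this model.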
Multi-Region HA
For further high availability options, ZeroStack’s infrastructure is organized into Availability Zones (AZs) and Regions. Each AZ represents a fault domain whose failure does not impact workloads running in another AZ. Similarly, a Region can be a geographically distributed site that provides disaster recovery as well as better performance and data locality. ZeroStack provides automated placement policies that schedule workloads in a given AZ or Region and assign a storage pool type based on protection requirements.
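A placement policy of this kind might look like the following sketch. Everything here is hypothetical: the pool names, the `Placement` type, and the round-robin spreading across AZs are assumptions chosen for illustration, since the text only states that workloads are scheduled into a given AZ or Region with a storage pool type matched to protection requirements.

```python
from dataclasses import dataclass

# Hypothetical mapping from a requested pool kind to a pool name.
POOLS = {"ssd": "ssd-replicated", "hdd": "hdd-replicated"}


@dataclass(frozen=True)
class Placement:
    region: str
    az: str
    storage_pool: str


def place(instances, region, azs_by_region, pool_kind):
    """Spread `instances` copies of a workload across the AZs of one region.

    Round-robin spreading is an assumed policy: it keeps a single AZ
    failure from taking down every copy when the region has several AZs.
    """
    azs = azs_by_region[region]
    if not azs:
        raise ValueError(f"region {region!r} has no availability zones")
    pool = POOLS[pool_kind]
    return [Placement(region, azs[i % len(azs)], pool) for i in range(instances)]
```

Calling `place(3, "us-west", {"us-west": ["az1", "az2"]}, "ssd")` puts two copies in `az1` and one in `az2`, all on the SSD pool.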
Remote replication across sites can be done today using external storage partners. For countries and territories where all customer data and management/operations must remain in-country, ZeroStack can provide SaaS portal access locally, with all compute and data assets located within the country.