How VMware snapshots work and best practices

What is a snapshot?

A snapshot preserves the state and data of a virtual machine at a specific point in time.

  • The state includes the virtual machine’s power state (for example, powered-on, powered-off, suspended).
  • The data includes all of the files that make up the virtual machine. This includes disks, memory, and other devices, such as virtual network interface cards.

A virtual machine provides several operations for creating and managing snapshots and snapshot chains. These operations let you create snapshots, revert to any snapshot in the chain, and remove snapshots. You can create extensive snapshot trees.

How do snapshots work?

The VMware API allows VMware and third-party products to perform operations on virtual machines and their snapshots. Below is a list of common operations that can be performed on virtual machines and snapshots using the API; a PowerCLI sketch of these operations follows the list:

  • CreateSnapshot: Creates a new snapshot of a virtual machine. As a side effect, this updates the current snapshot.
  • RemoveSnapshot: Removes a snapshot and deletes any associated storage.
  • RemoveAllSnapshots: Removes all snapshots associated with a virtual machine. If a virtual machine does not have any snapshots, this operation simply returns successfully.
  • RevertToSnapshot: Changes the execution state of a virtual machine to the state of this snapshot. This is equivalent to the Go To option under the Snapshot Manager while using vSphere/VI client GUI.
  • Consolidate: Merges the hierarchy of redo logs. This is available in vSphere 5.0 and later.
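
For illustration, here is a minimal PowerCLI sketch of these operations, assuming an existing Connect-VIServer session; the VM and snapshot names are placeholders, not values from this article:

    $vm = Get-VM -Name "TestVM"                      # hypothetical VM name

    # CreateSnapshot, including the memory state (the VM must be powered on;
    # memory and quiesce are normally used separately)
    New-Snapshot -VM $vm -Name "Pre-patch" -Memory

    # CreateSnapshot with the quiesce option (requires VMware Tools in the guest)
    New-Snapshot -VM $vm -Name "Quiesced" -Quiesce

    # RevertToSnapshot: equivalent to the "Go To" option in Snapshot Manager
    Set-VM -VM $vm -Snapshot (Get-Snapshot -VM $vm -Name "Pre-patch") -Confirm:$false

    # RemoveSnapshot: delete a single snapshot and its associated storage
    Get-Snapshot -VM $vm -Name "Quiesced" | Remove-Snapshot -Confirm:$false

    # RemoveAllSnapshots: delete every snapshot of the VM
    Get-Snapshot -VM $vm | Remove-Snapshot -Confirm:$false

    # Consolidate (vSphere 5.0 and later): merge any redo logs left behind
    $vm.ExtensionData.ConsolidateVMDisks_Task()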

This is a high-level overview of how create, remove, or revert snapshot requests are processed within the VMware environment:

  1. A request to create, remove, or revert a snapshot for a virtual machine is sent from the client to the server using the VMware API.
  2. The request is forwarded to the VMware ESX host that is currently hosting the virtual machine in question.

    Note: This only occurs if the original request was sent to a different server, such as vCenter, which is managing the ESX host.

  3. If the snapshot includes the memory option, the ESX host writes the memory of the virtual machine to disk.
  4. If the snapshot includes the quiesce option, the ESX host requests the guest operating system to quiesce the disks via VMware Tools.
  5. The ESX host makes the appropriate changes to the virtual machine’s snapshot database (.vmsd file) and the changes are reflected in the Snapshot Manager of the virtual machine.
  6. The ESX host calls a function similar to the Virtual Disk API functions to make changes to the child disks (-delta.vmdk and .vmdk files) and the disk chain.

When a snapshot is created, it consists of these files:

  • virtual_machine-00000x.vmdk and virtual_machine-00000x-delta.vmdk

    A collection of .vmdk and -delta.vmdk files exists for each virtual disk connected to the virtual machine at the time of the snapshot. These files can be referred to as child disks, redo logs, or delta links. These child disks can later be considered parent disks for future child disks. From the original parent disk, each child constitutes a redo log pointing back from the present state of the virtual disk, one step at a time, to the original.

    Note:

    • The numeric value (00000x) in the file names may not be consistent across all child disks from the same snapshot. The file names are chosen based on filename availability.
    • If the virtual disk is larger than 2 TB in size, the redo log file is of the -sesparse.vmdk format.
  • .vmsd

    The .vmsd file is a database of the virtual machine’s snapshot information and the primary source of information for the Snapshot Manager. The file contains line entries which define the relationships between snapshots as well as the child disks for each snapshot.

  • .vmsn

    The .vmsn file includes the current configuration and optionally the active state of the virtual machine. Capturing the memory state of the virtual machine lets you revert to a turned on virtual machine state. With nonmemory snapshots, you can only revert to a turned off virtual machine state. Memory snapshots take longer to create than nonmemory snapshots.

The disk chain

Generally, when you create a snapshot for the first time, the first child disk is created from the parent disk. Successive snapshots generate new child disks from the last child disk on the chain. The relationship can change if you have multiple branches in the snapshot chain.

Picture the snapshot chain as a series of linked disks, where each square in such a diagram represents a block of data, or grain; each child disk stores only the grains that have changed since it was created.

Follow these best practices when using snapshots in the vSphere environment:

  • Do not use snapshots as backups.

    The snapshot file is only a change log of the original virtual disk. It creates a placeholder disk, virtual_machine-00000x-delta.vmdk, to store data changes since the time the snapshot was created. If the base disks are deleted, the snapshot files are not sufficient to restore a virtual machine.

  • A maximum of 32 snapshots is supported in a chain. However, for better performance, use only 2 to 3 snapshots.
  • Do not use a single snapshot for more than 72 hours.

    The snapshot file continues to grow in size when it is retained for a longer period. This can cause the snapshot storage location to run out of space and impact the system performance.

  • When using third-party backup software, ensure that snapshots are deleted after a successful backup.

    Note: Snapshots taken by third-party software (through the API) may not appear in the Snapshot Manager. Routinely check for snapshots from the command line; see the PowerCLI sketch after this list.

  • Ensure that there are no snapshots before:
    • Performing Storage vMotion in vSphere 4.x and earlier environments.

      Note: vSphere 5.0 and later support Storage vMotion with snapshots on a virtual machine.

    • Increasing the virtual machine disk size or virtual RDM.

      Increasing the disk size when snapshots are still available can corrupt snapshots and result in data loss.
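
To keep an eye on these practices from the command line, here is a minimal PowerCLI sketch (assuming an existing vCenter connection) that reports every snapshot older than 72 hours, including snapshots that third-party backup software may have left behind:

    # List all snapshots in the inventory older than 72 hours, largest first
    Get-VM | Get-Snapshot |
        Where-Object { $_.Created -lt (Get-Date).AddHours(-72) } |
        Select-Object VM, Name, Created, SizeGB |
        Sort-Object SizeGB -Descending |
        Format-Table -AutoSize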

Restarting the Management Services of a Host: How Does It Help?

First, why do we need to restart the management agents of a host at all?

We also need to understand the impact, and when and on which host to do it. Simply restarting the services is an easy task; the real question is whether it is actually necessary, or whether it is just a shortcut to save time and get back to production.

Symptoms:

+++++++++

  • Cannot connect directly to the ESXi host or manage it under vCenter Server.
  • vCenter Server displays the error:

    Virtual machine creation may fail because agent is unable to retrieve VM creation options from the host

Restart Management agents in ESXi Using Direct Console User Interface (DCUI):

  1. Connect to the console of your ESXi host.
  2. Press F2 to customize the system.
  3. Log in as root.
  4. Use the Up/Down arrows to navigate to Troubleshooting Options > Restart Management Agents.
  5. Press Enter.
  6. Press F11 to restart the services.
  7. When the service restarts, press Enter.
  8. Press Esc to log out.

Restart Management agents in ESXi Using ESXi Shell or Secure Shell (SSH):

  1. Log in to ESXi Shell or SSH as root.

    To enable ESXi Shell or SSH, see Using ESXi Shell in ESXi 5.x and 6.x (KB 2004746).

  2. Restart the ESXi host daemon and vCenter Agent services using these commands:

    /etc/init.d/hostd restart

    /etc/init.d/vpxa restart

Note: In ESXi 4.x, run this command to restart the vpxa agent:

service vmware-vpxa restart

Alternatively:

  • To reset the management network on a specific VMkernel interface, by default vmk0, run the command:

    esxcli network ip interface set -e false -i vmk0; esxcli network ip interface set -e true -i vmk0

    Note: Using a semicolon (;) between the two commands ensures the VMkernel interface is disabled and then re-enabled in succession. If the management interface is not running on vmk0, change the above command according to the VMkernel interface used.

  • To restart all management agents on the host, run the command:

    services.sh restart

Now that we know how to restart the agents, let's look at what it might look like when the hostd and vpxa services are hung. A large number of zombie hostd processes can cause this.

For example, you can try killing the hostd process first if there is only one such process; if there are multiple, I would recommend rebooting the host rather than creating yet another zombie process. The same goes for the vpxa process.

You see these errors:
  • VPXA log errors:

    Authd error: 514 Error connecting to hostd-vmdb service instance.
    Failed to connect to host :902. Check that authd is running correctly (lib/connect error 11).

  • Errors in hostd.log:

    2017-11-27T19:57:41.000Z [282DFB70 info 'Vimsvc.ha-eventmgr'] Event 7856 : Issue detected on test.local in ha-datacenter: hostd detected to be non-responsive

Whether you should restart all the management agents or just an individual service depends on what the logs show.

You are the best judge of whether a reboot of the host or a restart of the agents is the better idea.

If you see errors such as the following:

  • The ESXi host appears as disconnected in vCenter Server
  • Attempting to reconnect the host fails
  • You see the error:

    A general system error occurred: Timed out waiting for vpxa to start

To resolve this issue:

  1. Restart the Management agents on the host.
  2. Log in to the host directly using the vSphere Client.
  3. Right-click the virtual machine and click Remove from Inventory.
  4. Restart the Management agents on the host.
  5. Right-click the host in vCenter Server and click Reconnect.

 

I hope this helps 🙂

 

A Few Storage-Related Known Issues

This might be useful to anyone who deals with storage-related issues day in and day out:

VMFS Issues

  • After failed attempts to grow a VMFS datastore, VIM API information and LVM information on the system is inconsistent
    This problem occurs when you attempt to grow the datastore while the backing SCSI device enters the APD or PDL state. As a result, you might observe inconsistent information in VIM APIs and LVM commands on the host.
    Workaround: Perform these steps:

    1. Run the vmkfstools --growfs command on one of the hosts connected to the volume.
    2. Perform the rescan-vmfs operation on all hosts connected to the volume.
  • VMFS6 datastore does not support combining 512n and 512e devices in the same datastore
    You can expand a VMFS6 datastore only with devices of the same type. If the VMFS6 datastore is backed by a 512n device, expand the datastore with 512n devices. If the datastore is created on a 512e device, expand the datastore with 512e devices.
    Workaround: None.
  • ESXi does not support automatic space reclamation on arrays with unmap granularity greater than 1 MB
    If the unmap granularity of the backing storage is greater than 1 MB, the unmap requests from the ESXi host are not processed. You can see the Unmap not supported message in the vmkernel.log file.
    Workaround: None.
  • Using storage rescan in environments with a large number of LUNs might cause unpredictable problems
    Storage rescan is an I/O-intensive operation. If you run it while performing other datastore management operations, such as creating or extending a datastore, you might experience delays and other problems. Problems are likely to occur in environments with a large number of LUNs, up to the 1024 that are supported in the vSphere 6.5 release.
    Workaround: Typically, the storage rescans that your hosts periodically perform are sufficient. You are not required to rescan storage when you perform general datastore management tasks. Run storage rescans only when absolutely necessary, especially when your deployments include a large set of LUNs.

NFS Issues

  • The NFS 4.1 client loses synchronization with the NFS server when attempting to create new sessions
    This problem occurs after a period of interrupted connectivity with the NFS server, or when NFS I/Os do not get a response. When this issue occurs, the vmwarning.log file contains a throttled series of warning messages similar to the following:
    NFS41 CREATE_SESSION request failed with NFS4ERR_SEQ_MISORDERED
    Workaround: Perform the following steps:

    1. Unmount the affected NFS 4.1 datastores. If no files are open when you unmount, this operation succeeds and the NFS 4.1 client module cleans up its internal state. You can then remount the datastores that were unmounted and resume normal operation.
    2. If unmounting the datastore does not solve the problem, disable the NICs connecting to the IP addresses of the NFS shares. Keep the NICs disabled for as long as it is required for the server lease times to expire, and then bring the NICs back up. Normal operations should resume.
    3. If the preceding steps fail, reboot the ESXi host.
  • After an ESXi reboot, NFS 4.1 datastores exported by EMC VNX storage fail to mount
    Due to a potential problem with EMC VNX, NFS 4.1 remount requests might fail after an ESXi host reboot. As a result, any existing NFS 4.1 datastores exported by this storage appear as unmounted.
    Workaround: Wait for the lease time of 90 seconds to expire and manually remount the volume.
  • Mounting the same NFS datastore with different labels might trigger failures when you attempt to mount another datastore later
    The problem occurs when you use the esxcli command to mount the same NFS datastore on different ESXi hosts. If you use different labels, for example A and B, vCenter Server renames B to A, so that the datastore has consistent labels across the hosts. If you later attempt to mount a new datastore and use the B label, your ESXi host fails. This problem occurs only when you mount the NFS datastore with the esxcli command. It does not affect mounting through the vSphere Web Client.
    Workaround: When mounting the same NFS datastore with the esxcli commands, make sure to use consistent labels across the hosts.
  • An NFS 4.1 datastore exported from a VNX server might become inaccessible
    When the VNX server disconnects from the ESXi host, the NFS 4.1 datastore might become inaccessible. This issue occurs if the VNX server unexpectedly changes its major number. The NFS 4.1 client does not expect the server major number to change after establishing connectivity with the server.
    Workaround: Remove all datastores exported by the server and then remount them.

Virtual Volumes Issues

  • After upgrade from vSphere 6.0 to vSphere 6.5, the Virtual Volumes storage policy might disappear from the VM Storage Policies list
    After you upgrade your environment to vSphere 6.5, the Virtual Volumes storage policy that you created in vSphere 6.0 might no longer be visible in the list of VM storage policies.
    Workaround: Log out of the vSphere Web Client, and then log in again.
  • The vSphere Web Client fails to display information about the default profile of a Virtual Volumes datastore
    Typically, you can check information about the default profile associated with the Virtual Volumes datastore. In the vSphere Web Client, you do it by browsing to the datastore, and then clicking Configure > Settings > Default Profiles.
    However, the vSphere Web Client is unable to report the default profiles when their IDs, configured at the storage side, are not unique across all the datastores reported by the same Virtual Volumes provider.
    Workaround: None.

iSCSI Issues

  • In vSphere 6.5, the name assigned to the iSCSI software adapter is different from the earlier releases
    After you upgrade to the vSphere 6.5 release, the name of the existing software iSCSI adapter, vmhbaXX, changes. This change affects any scripts that use hard-coded values for the name of the adapter. Because VMware does not guarantee that the adapter name remains the same across releases, you should not hard code the name in the scripts. The name change does not affect the behavior of the iSCSI software adapter.
    Workaround: None.

Storage Host Profiles Issues

  • Attempts to set the action_OnRetryErrors parameter through host profiles fail 
    This problem occurs when you edit a host profile to add the SATP claim rule that activates the action_OnRetryErrors setting for NMP devices claimed by VMW_SATP_ALUA. The setting controls the ability of an ESXi host to mark a problematic path as dead and trigger a path failover. When added through the host profile, the setting is ignored.
    Workaround: You can use two alternative methods to set the parameter on a reference host.

    • Use the following esxcli command to enable or disable the action_OnRetryErrors parameter:
      esxcli storage nmp satp generic deviceconfig set -c disable_action_OnRetryErrors -d naa.XXX 
      esxcli storage nmp satp generic deviceconfig set -c enable_action_OnRetryErrors -d naa.XXX
    • Perform these steps:
      1. Add the VMW_SATP_ALUA claimrule to the SATP rule:
        esxcli storage nmp satp rule add --satp=VMW_SATP_ALUA --option=enable_action_OnRetryErrors --psp=VMW_PSP_XXX --type=device --device=naa.XXX
      2. Run the following commands to reclaim the device:
        esxcli storage core claimrule load
        esxcli storage core claiming reclaim -d naa.XXX

VM Storage Policy Issues

  • Hot migrating a virtual machine with vMotion across vCenter Servers might change the compliance status of a VM storage policy 
    After you use vMotion to perform a hot migration of a virtual machine across vCenter Servers, the VM Storage Policy compliance status changes to UNKNOWN.
    Workaround: Check compliance on the migrated virtual machine to refresh the compliance status (see also the PowerCLI sketch after the steps below).

    1. In the vSphere Web Client, browse to the virtual machine.
    2. From the right-click menu, select VM Policies > Check VM Storage Policy Compliance.
      The system verifies the compliance.
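
If you prefer the command line, a hedged PowerCLI alternative using the SPBM cmdlets is sketched below; the VM name is a placeholder, and the cmdlet only reports the current SPBM view, so the compliance check above may still be needed to refresh it:

    # Show the storage policy and compliance status of the migrated VM (hypothetical name)
    Get-SpbmEntityConfiguration -VM (Get-VM -Name "MigratedVM") |
        Select-Object Entity, StoragePolicy, ComplianceStatus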

Storage Driver Issues

  • The bnx2x inbox driver that supports the QLogic NetXtreme II Network/iSCSI/FCoE adapter might cause problems in your ESXi environment
    Problems and errors occur when you disable or enable VMkernel ports and change the failover order of NICs for your iSCSI network setup.
    Workaround: Replace the bnx2x driver with an asynchronous driver. For information, see the VMware Web site.
  • The ESXi host might experience problems when you use Seagate SATA storage drives
    If you use an HBA adapter that is claimed by the lsi_msgpt3 driver, the host might experience problems when connecting to the Seagate SATA devices. The vmkernel.log file displays errors similar to the following:
    SCSI cmd RESERVE failed on path XXX
    reservation state on device XXX is unknown
    Workaround: Replace the Seagate SATA drive with another drive.
  • When you use the Dell lsi_mr3 driver version 6.903.85.00-1OEM.600.0.0.2768847, you might encounter errors
    If you use the Dell lsi_mr3 asynchronous driver version 6.903.85.00-1OEM.600.0.0.2768847, the VMkernel logs might display the following message:
    ScsiCore: 1806: Invalid sense buffer
    Workaround: Replace the driver with the vSphere 6.5 inbox driver or an asynchronous driver from Broadcom.

Boot from SAN Issues

  • Installing ESXi 6.5 on a Fibre Channel or iSCSI LUN with LUN ID greater than 255 is not supported 
    vSphere 6.5 supports LUN IDs from 0 to 16383. However, due to adapter BIOS limitations, you cannot use LUNs with IDs greater than 255 for boot-from-SAN installation.
    Workaround: For ESXi installation, use LUNs with IDs of 255 or lower.

Miscellaneous Storage Issues

  • If you use SESparse VMDK, formatting a VM with a Windows or Linux file system takes longer
    When you format a VM with a Windows or Linux file system, the process might take longer than usual. This occurs if the virtual disk is SESparse.
    Workaround: Before formatting, disable the UNMAP operation on the guest operating system. You can re-enable the operation after the formatting process completes.
  • Attempts to use the VMW_SATP_LOCAL plug-in for shared remote SAS devices might trigger problems and failures
    In releases earlier than ESXi 6.5, SAS devices are marked as remote despite being claimed by the VMW_SATP_LOCAL plug-in. In ESXi 6.5, all devices claimed by VMW_SATP_LOCAL are marked as local even when they are external. As a result, when you upgrade to ESXi 6.5 from earlier releases, any of your existing remote SAS devices that were previously marked as remote change their status to local. This change affects shared datastores deployed on these devices and might cause problems and unpredictable behavior.
    In addition, problems occur if you incorrectly use the devices that are now marked as local, but are in fact shared and external, for certain features. For example, when you allow creation of the VFAT file system, or use the devices for Virtual SAN.
    Workaround: Do not use the VMW_SATP_LOCAL plug-in for remote external SAS devices. Make sure to use another applicable SATP from the supported list or a vendor-unique SATP.
  • Logging out of the vSphere Web Client while uploading a file to a datastore cancels the upload and leaves an incomplete file
    Uploading large files to a datastore takes some time. If you log out while uploading a file, the upload is cancelled without warning. The partially uploaded file might remain on the datastore.
    Workaround: Do not log out during file uploads. If the datastore contains an incomplete file, manually delete the file from the datastore.

Storage I/O Control Issues

  • You cannot change VM I/O filter configuration during cloning
    Changing a virtual machine’s policies during cloning is not supported by Storage I/O Control.
    Workaround: Perform the clone operation without any policy change. You can update the policy after completing the clone operation.
  • Storage I/O Control settings are not honored per VMDK
    Storage I/O Control settings are not honored on a per-VMDK basis. The settings are honored at the virtual machine level.
    Workaround: None.

Storage DRS Issues

  • Storage DRS does not honor Pod-level VMDK affinity if the VMDKs on a virtual machine have a storage policy attached to them 
    If you set a storage policy on the VMDK of a virtual machine that is part of a datastore cluster with Storage DRS enabled, then Storage DRS does not honor the Keep VMDKs together flag for that virtual machine. It might recommend different datastores for newly added or existing VMDKs.
    Workaround: None. This behavior is observed when you set any kind of policy, such as VMCrypt or tag-based policies.
  • You cannot disable Storage DRS when deploying a VM from an OVF template
    When you deploy an OVF template and select an individual datastore from a Storage DRS cluster for the VM placement, you cannot disable Storage DRS for your VM. Storage DRS remains enabled and might later move this VM to a different datastore.
    Workaround: To permanently keep the VM on the selected datastore, manually change the automation level of the VM. Add the VM to the VM overrides list from the storage cluster settings.

Backup and Restore Issues

  • After file-based restore of a vCenter Server Appliance to a vCenter Server instance, operations in the vSphere Web Client such as configuring a high availability cluster or enabling SSH access to the appliance may fail
    In the process of restoring a vCenter Server instance, a new vCenter Server Appliance is deployed and the appliance HTTP server is started with a self-signed certificate. The restore process completes by recovering the backed-up certificates, but without restarting the appliance HTTP server. As a result, any operation that requires an internal API call to the appliance HTTP server fails.
    Workaround: After restoring the vCenter Server Appliance to a vCenter Server instance, log in to the appliance and restart its HTTP server by running the command service vami-lighttp restart.
  • Attempts to restore a Platform Services Controller appliance from a file-based backup fail if you have changed the number of vCPUs or the disk size of the appliance
    In vSphere 6.5, the Platform Services Controller appliance is deployed with 2 vCPUs and a 60 GB disk. Increasing the number of vCPUs and the disk size is unsupported. If you try to perform a file-based restore of a Platform Services Controller appliance with more than 2 vCPUs or more than 60 GB of disk, the vCenter Server Appliance installer fails with the error: No possible size matches your set of requirements.
    Workaround: Decrease the number of processors to no more than 2 vCPUs and the disk size to no more than 60 GB.
  • Restoring a vCenter Server Appliance with an external Platform Services Controller from an image-based backup does not start all vCenter Server services
    After you use vSphere Data Protection to restore a vCenter Server Appliance with an external Platform Services Controller, you must run the vcenter-restore script to complete the restore operation and start the vCenter Server services. The vcenter-restore execution might fail with the error message: Operation Failed. Please make sure the SSO username and password are correct and rerun the script. If problem persists, contact VMware support.
    Workaround: After the vcenter-restore execution has failed, run the service-control --start --all command to start all services.

    If the service-control --start --all execution fails, verify that you entered the correct vCenter Single Sign-On user name and password. You can also contact VMware Support.

Reference : https://docs.vmware.com/en/VMware-vSphere/6.5/rn/vsphere-vcenter-server-651-release-notes.html

PowerShell Modules for Rubrik

Let's look at how to install the Rubrik PowerShell module and connect to a Rubrik cluster.

Requirements:

  1. PowerShell version 4 or higher
  2. PowerCLI version 6.0 or higher
  3. Rubrik version 2.2 or higher
  4. (optional) Windows Management Framework 5.0 

Installation:

+++++++++

Option 1: PowerShell Gallery

  1. Ensure you have the Windows Management Framework 5.0 or greater installed.
  2. Open a PowerShell console with the Run as Administrator option.
  3. Run Set-ExecutionPolicy using the parameter RemoteSigned or Bypass.
  4. Run Install-Module -Name Rubrik -Scope CurrentUser to download the module from the PowerShell Gallery. Note that the first time you install from the remote repository it may ask you to first trust the repository.
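
For reference, here is a minimal sketch of Option 1 as a single elevated PowerShell session (assuming the PowerShell Gallery is reachable from your workstation):

    # Allow locally run scripts, then pull the Rubrik module from the PowerShell Gallery
    Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
    Install-Module -Name Rubrik -Scope CurrentUser
    Import-Module Rubrik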

Option 2: Installer Script

  1. Download the master branch to your workstation.
  2. Open a PowerShell console with the Run as Administrator option.
  3. Run Set-ExecutionPolicy using the parameter RemoteSigned or Bypass.
  4. Run the Install-Rubrik.ps1 script in the root of this repository and follow the prompts to install, upgrade, or delete your Rubrik Module contents.

Then let's look at some commands:

 

The module provides a set of cmdlets that you can execute from PowerShell.

Now let's start with how we connect:
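
A minimal sketch of the connection, assuming a hypothetical cluster address of 192.168.1.100 and an account that already exists on the Rubrik cluster:

    # Prompt for credentials and connect to the Rubrik cluster (the address is a placeholder)
    $cred = Get-Credential
    Connect-Rubrik -Server 192.168.1.100 -Credential $cred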

This connects you to the cluster. Once you are connected, you can list the virtual machines the cluster protects.
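
A hedged sketch using the module's Get-RubrikVM cmdlet (the exact output columns vary by Rubrik CDM version, and the VM name is a placeholder):

    # List the virtual machines known to the Rubrik cluster
    Get-RubrikVM | Format-Table -AutoSize

    # Or narrow the query down to a single VM by name
    Get-RubrikVM -Name "TestVM"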

When we run this, we can see the multiple instances registered on the cluster.

Let's check the list of filesets now:
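
A hedged sketch using Get-RubrikFileset:

    # List all filesets defined on the cluster
    Get-RubrikFileset | Format-Table -AutoSize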

 

It gives you a rich set of information that you can filter and format to suit your environment.

You can use the Get-Command -Module Rubrik command to explore the full list of cmdlets in the Rubrik module.

Please comment on this article if you need any information regarding PowerShell execution against the Rubrik module.

Boot Device for VSAN

Booting an ESXi host that is part of a vSAN cluster from a flash device imposes certain restrictions.

When you boot a vSAN host from a USB/SD device, you must use a high-quality USB or SD flash drive of 4 GB or larger.

When you boot a vSAN host from a SATADOM device, you must use a single-level cell (SLC) device. The size of the boot device must be at least 16 GB.

During installation, the ESXi installer creates a coredump partition on the boot device. The default size of the coredump partition satisfies most installation requirements, but in some cases you must resize it, as noted below.

  • If the ESXi host has 512 GB of memory or less, you can boot the host from a USB, SD, or SATADOM device.

  • If the ESXi host has more than 512 GB of memory, consider the following guidelines.

    • You can boot the host from a SATADOM or disk device with a size of at least 16 GB. When you use a SATADOM device, use a single-level cell (SLC) device.

    • If you are using vSAN 6.5 or later, you must resize the coredump partition on ESXi hosts to boot from USB/SD devices. For more information, see the VMware knowledge base article at http://kb.vmware.com/kb/2147881.

Hosts that boot from a disk have a local VMFS. If you have a disk with VMFS that runs VMs, you must separate the disk for an ESXi boot that is not for vSAN. In this case you need separate controllers.

Best Practices for vSAN Networking

Consider these networking best practices for vSAN to improve performance and throughput. A PowerCLI sketch for tagging vSAN traffic on a VMkernel adapter follows the list.

  • For hybrid configurations, dedicate at least a 1-GbE physical network adapter. Place vSAN traffic on a dedicated or shared 10-GbE physical adapter for the best networking performance.

  • For all-flash configurations, use a dedicated or shared 10-GbE physical network adapter.

  • Provision one additional physical NIC as a failover NIC.

  • If you use a shared 10-GbE network adapter, place the vSAN traffic on a distributed switch and configure Network I/O Control to guarantee bandwidth to vSAN.
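
As a quick check from PowerCLI, here is a minimal sketch (the host name and VMkernel adapter name are placeholders) that tags a dedicated VMkernel adapter for vSAN traffic and then lists which adapters on the host carry it:

    # Enable vSAN traffic on a dedicated VMkernel adapter (host and vmk names are hypothetical)
    $vmk = Get-VMHostNetworkAdapter -VMHost "esxi01.example.local" -Name "vmk2" -VMKernel
    Set-VMHostNetworkAdapter -VirtualNic $vmk -VsanTrafficEnabled $true -Confirm:$false

    # Verify which VMkernel adapters carry vSAN traffic
    Get-VMHostNetworkAdapter -VMHost "esxi01.example.local" -VMKernel |
        Where-Object { $_.VsanTrafficEnabled } |
        Select-Object Name, IP, PortGroupName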

Monitor the Resynchronization Tasks in the vSAN Cluster

To evaluate the status of objects that are being resynchronized, you can monitor the resynchronization tasks that are currently in progress.

Prerequisites

Verify that hosts in your vSAN cluster are running ESXi 6.5 or later.

Procedure

  1. Navigate to the vSAN cluster in the vSphere Web Client.
  2. Select the Monitor tab and click vSAN.
  3. Select Resyncing Components to track the progress of resynchronization of virtual machine objects and the number of bytes that are remaining before the resynchronization is complete.

    NOTE: If your cluster has connectivity issues, the data on the Resyncing Components page might not get refreshed as expected and the fields might reflect inaccurate information.

Maintenance Mode on VSAN

For any maintenance activity on an ESXi host running vSAN, the first thing you will want to do is place the host into maintenance mode. If you have never performed this operation on a vSAN host before, you should be aware that there is an additional option to specify how the vSAN data will be migrated. The vSphere Web Client provides three options, described below.

Procedure:

  1. Right-click the host and select Maintenance Mode > Enter Maintenance Mode.
  2. Select a data evacuation mode and click OK.

Ensure data accessibility from other hosts:

++++++++++++++++++++++++++++++

This is the default option. When you power off or remove the host from the cluster, vSAN ensures that all accessible virtual machines on this host remain accessible. Select this option if you want to take the host out of the cluster temporarily, for example, to install upgrades, and plan to have the host back in the cluster. This option is not appropriate if you want to remove the host from the cluster permanently.

Evacuate all data to other hosts:

+++++++++++++++++++++++++

vSAN evacuates all data to other hosts in the cluster, maintains or fixes availability compliance for the affected components, and protects data when sufficient resources exist in the cluster. Select this option if you plan to migrate the host permanently. When evacuating data from the last host in the cluster, make sure that you migrate the virtual machines to another datastore and then place the host in maintenance mode.

This evacuation mode results in the largest amount of data transfer and consumes the most time and resources. All the components on the local storage of the selected host are migrated elsewhere in the cluster. When the host enters maintenance mode, all virtual machines have access to their storage components and are still compliant with their assigned storage policies.

No data evacuation:

+++++++++++++++++

vSAN does not evacuate any data from this host. If you power off or remove the host from the cluster, some virtual machines might become inaccessible.
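
The same choice is available from PowerCLI through the -VsanDataMigrationMode parameter of Set-VMHost. A minimal sketch, assuming a hypothetical host name:

    # Enter maintenance mode and fully evacuate vSAN data (host name is a placeholder)
    Set-VMHost -VMHost "esxi01.example.local" -State Maintenance -VsanDataMigrationMode Full

    # Other accepted values are EnsureAccessibility (the default described above) and NoDataMigration.
    # Exit maintenance mode when the work is done:
    Set-VMHost -VMHost "esxi01.example.local" -State Connected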

How to move vSAN Datastore into a Folder?

vSphere folders are commonly used by administrators for organizational purposes and for permission delegation. When a customer tried to move their vSAN datastore into a folder using the vSphere Web Client (this applies to the HTML5 client as well), they found that nothing happens, even though the UI indicates the operation should be possible with the (+) symbol.

I decided to perform the operation using the vSphere API instead of the UI. Behind the scenes, the UI simply calls the MoveIntoFolder_Task() vSphere API which allows you to move various vSphere Inventory objects into a vSphere Folder.

For PowerCLI users, we can use the Move-Datastore cmdlet, which I will be using here:

In my setup, I have one vSAN datastore, from a vSphere 6.0u3 environment. Let's say I want to move this 6.0u3 datastore into a folder named TEST. The following PowerCLI snippet does exactly that:

Move-Datastore -Datastore (Get-Datastore "vsanDatastore") -Destination (Get-Folder "TEST")

You can see that the vSAN datastore is now part of the TEST vSphere folder!

For now, if you need to move a vSAN-based datastore into a vSphere folder, simply use the vSphere API (or PowerCLI, as shown above) as a workaround.
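
For completeness, here is a hedged sketch of calling the underlying MoveIntoFolder_Task() API directly from PowerCLI views, reusing the TEST folder and vsanDatastore names from the example above:

    # Resolve the target folder view and the datastore's managed object reference
    $folderView = Get-View -VIObject (Get-Folder -Name "TEST")
    $dsMoRef    = (Get-Datastore -Name "vsanDatastore").ExtensionData.MoRef

    # MoveIntoFolder accepts an array of managed entities to move into the folder
    $folderView.MoveIntoFolder(@($dsMoRef))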