Voice over IP Traffic Marking

Assign priority tags to traffic, such as VoIP and streaming video, that has higher networking requirements for bandwidth, low latency, and so on. You can mark the traffic with a CoS tag in Layer 2 of the network protocol stack or with a DSCP tag in Layer 3.

Priority tagging is a mechanism to mark traffic that has higher QoS demands. In this way, the network can recognize different classes of traffic. The network devices can handle the traffic from each class according to its priority and requirements.

You can also re-tag traffic to raise or lower the importance of the flow. For example, by applying a low QoS tag, you can de-prioritize data that was tagged inside a guest operating system.

Procedure

  1. Locate a distributed port group or an uplink port group in the vSphere Web Client.
    1. Select a distributed switch and click the Related Objects tab.
    2. Click Distributed Port Groups to see the list of distributed port groups, or click Uplink Port Groups to see the list of uplink port groups.
  2. Right-click the port group and select Edit settings.
  3. Select Traffic filtering and marking.
  4. If traffic filtering and marking is disabled, enable it from the Status drop-down menu.
  5. Click New to create a new rule, or select a rule and click Edit to edit it.
  6. In the network traffic rule dialog box, select the Tag option from the Action drop-down menu.
  7. Set the priority tag for the traffic within the scope of the rule.

Specify the kind of traffic that the rule is applicable to.

To determine if a data flow is in the scope of a rule for marking or filtering, the vSphere distributed switch examines the direction of the traffic, and properties like source and destination, VLAN, next level protocol, infrastructure traffic type, and so on.
  1. From the Traffic direction drop-down menu, select whether the traffic must be ingress, egress, or both so that the rule recognizes it as matching.

    The direction also influences how you are going to identify the traffic source and destination.

  2. By using qualifiers for system data type, Layer 2 packet attributes, and Layer 3 packet attributes, set the properties that packets must have to match the rule.

    A qualifier represents a set of matching criteria related to a networking layer. You can match traffic to system data type, Layer 2 traffic properties, and Layer 3 traffic properties. You can use the qualifier for a specific networking layer or can combine qualifiers to match packets more precisely.

    • Use the system traffic qualifier to match packets to the type of virtual infrastructure data that is flowing through the ports of the group. For example, you can select NFS for data transfers to network storage.
    • Use the MAC traffic qualifier to match packets by MAC address, VLAN ID, and next level protocol.

      Locating traffic with a VLAN ID on a distributed port group works with Virtual Guest Tagging (VGT). To match traffic to VLAN ID if Virtual Switch Tagging (VST) is active, use a rule on an uplink port group or uplink port.

    • Use the IP traffic qualifier to match packets by IP version, IP address, and next level protocol and port.
  3. In the rule dialog box, click OK to save the rule.
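As a mental model, the qualifier matching described above can be sketched in Python. The field and rule names here are illustrative, not a VMware API; a packet matches a rule only when every qualifier the rule specifies is satisfied:

```python
import ipaddress

def packet_matches_rule(packet, rule):
    """Return True if the packet satisfies every qualifier in the rule."""
    # Traffic direction must match (ingress, egress, or both).
    if rule.get("direction", "both") not in ("both", packet["direction"]):
        return False
    # System traffic qualifier, e.g. "NFS" or "vMotion".
    if "system_traffic" in rule and packet.get("system_traffic") != rule["system_traffic"]:
        return False
    # MAC (Layer 2) qualifier: VLAN ID.
    if "vlan_id" in rule and packet.get("vlan_id") != rule["vlan_id"]:
        return False
    # IP (Layer 3) qualifier: next-level protocol, destination subnet, port.
    if "protocol" in rule and packet.get("protocol") != rule["protocol"]:
        return False
    if "dst_subnet" in rule:
        subnet = ipaddress.ip_network(rule["dst_subnet"])
        if ipaddress.ip_address(packet["dst_ip"]) not in subnet:
            return False
    if "dst_port" in rule and packet.get("dst_port") != rule["dst_port"]:
        return False
    return True

# Example: an egress SIP/UDP rule for subnet 192.168.2.0/24 (SIP port 5060).
sip_rule = {"direction": "egress", "protocol": "udp",
            "dst_subnet": "192.168.2.0/24", "dst_port": 5060}
pkt = {"direction": "egress", "protocol": "udp",
       "dst_ip": "192.168.2.15", "dst_port": 5060}
print(packet_matches_rule(pkt, sip_rule))  # True
```

Combining several qualifiers, as in the example rule, narrows the match exactly the way the distributed switch does: a packet that fails any one qualifier falls outside the rule's scope.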

 

Voice over IP (VoIP) flows have special requirements for QoS in terms of low loss and delay. The traffic related to the Session Initiation Protocol (SIP) for VoIP usually has a DSCP tag equal to 26, which stands for Assured Forwarding Class 3 with Low Drop Probability (AF31).

For example, to mark outgoing SIP UDP packets to the subnet 192.168.2.0/24, you can create a rule with the Tag action that sets the DSCP value to 26, matches the Egress direction, and uses an IP qualifier for the UDP protocol and destination addresses in 192.168.2.0/24.

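The DSCP value 26 used for SIP is not arbitrary: in Assured Forwarding (RFC 2597), class x with drop precedence y encodes as DSCP 8x + 2y, so AF31 = 8*3 + 2*1 = 26. A quick check:

```python
def af_dscp(af_class, drop_precedence):
    """DSCP code point for Assured Forwarding AFxy (RFC 2597):
    the AF class occupies the three high bits of the 6-bit code point,
    the drop precedence the next two bits."""
    assert 1 <= af_class <= 4 and 1 <= drop_precedence <= 3
    return (af_class << 3) | (drop_precedence << 1)

print(af_dscp(3, 1))                  # 26, the AF31 tag commonly used for SIP
print(format(af_dscp(3, 1), "06b"))   # 011010
```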

Lost redundant path to storage device & Path redundancy to the storage device is degraded

Lost redundant path to storage device:
  • ESX host has lost all the redundant paths to a storage device and there is only one active path left
  • The Task and Events tab of the ESX host on the vSphere Client reports this error under events:

    Lost path redundancy to storage device DeviceName. Path vmhbaN:Cx:Ty:Lz is down. Affected datastores: DatastoreName

  • The ESX host and virtual machines can still access the storage device through the single active path that remains
Solution

If vmhba3:C0:T1:L7 is the last redundant path to go down for a specified storage device, you see this message:

Lost path redundancy to storage device naa.xxxxxxxx. Path vmhba3:C0:T1:L7 is down. Affected datastores: Datastore1

This situation indicates a single point of failure and can affect the performance of the virtual machines currently accessing the storage device.

The runtime name vmhba3:C0:T1:L7 in the example identifies several potential failure points:

  • vmhba3 – HBA (Host Bus Adapter) 3
  • C0 – Channel 0
  • T1 – Target 1 (storage processor port)
  • L7 – LUN 7 (Logical Unit Number or disk unit)
Path redundancy to the storage device is degraded
  • ESX host has lost a redundant path to a storage device, but more than one active path remains.
  • The Task and Events tab of the ESX host on the vSphere Client reports the error:

    Path redundancy to storage device DeviceName degraded. Path vmhbaN:Cx:Ty:Lz is down. Affected datastores: DatastoreName

  • ESX host and virtual machines continue to access the storage device through the remaining active paths.
Solution

If vmhba3:C0:T1:L7, which is one of the many active paths to a specified storage device, goes down, you see this message:

Path redundancy to storage device naa.xxxxxx degraded. Path vmhba3:C0:T1:L7 is down. Affected datastores: Datastore1

This situation indicates reduced path redundancy and can affect the performance of the virtual machines currently accessing the storage device.

The runtime name vmhba3:C0:T1:L7 in the example identifies several potential failure points:

  • vmhba3 – HBA (Host Bus Adapter)
  • C0 – Channel 0
  • T1 – Target 1 (Storage processor port)
  • L7 – LUN 7 (Logical Unit Number or Disk Unit)
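Runtime names like vmhba3:C0:T1:L7 follow a fixed vmhbaN:Cx:Ty:Lz pattern, so they can be split into these failure points programmatically. A small illustrative parser:

```python
import re

# Matches ESXi runtime path names of the form vmhbaN:Cx:Ty:Lz.
PATH_RE = re.compile(r"^vmhba(\d+):C(\d+):T(\d+):L(\d+)$")

def parse_runtime_name(name):
    """Split a runtime path name into adapter, channel, target, and LUN."""
    m = PATH_RE.match(name)
    if not m:
        raise ValueError("not a vmhbaN:Cx:Ty:Lz runtime name: %r" % name)
    adapter, channel, target, lun = map(int, m.groups())
    return {"adapter": adapter, "channel": channel,
            "target": target, "lun": lun}

print(parse_runtime_name("vmhba3:C0:T1:L7"))
# {'adapter': 3, 'channel': 0, 'target': 1, 'lun': 7}
```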

Examples:

Consider one LUN in each array:

1: Tegile: There are two paths, one from vmhba2 and one from vmhba3. When vmhba3 went down, only one path was left. The ESXi host lost all the redundant paths to the storage device, with a single active path remaining. Hence the error "Lost path redundancy to storage device".

naa.61c5a0b06b59c80100005d0bb7990010:
Device Display Name: TEGILE Fibre Channel Disk (naa.61c5a0b06b59c80100005d0bb7990010)
Storage Array Type: VMW_SATP_ALUA
Storage Array Type Device Config: {implicit_support=on; explicit_support=off; explicit_allow=on; alua_followover=on; action_OnRetryErrors=off; {TPG_id=9,TPG_state=AO}{TPG_id=8,TPG_state=AO}}
Path Selection Policy: VMW_PSP_RR
Path Selection Policy Device Config: {policy=rr,iops=1,bytes=10485760,useANO=0; lastPathIndex=1: NumIOsPending=0,numBytesPending=0}
Path Selection Policy Device Custom Config:
Working Paths: vmhba2:C0:T6:L10, vmhba3:C0:T6:L10
Is USB: false

2019-07-29T21:42:03.978Z: [scsiCorrelator] 8734477455956us: [esx.problem.storage.redundancy.lost] Lost path redundancy to storage device naa.61c5a0b06b59c80100005d0bb7990010. Path vmhba3:C0:T6:L10 is down. Affected datastores: “TowerVCL2_Tier1_STTEG1_DS11”.
2019-07-29T21:46:57.865Z: [scsiCorrelator] 8734771343649us: [esx.clear.storage.redundancy.restored] Path redundancy to storage device naa.61c5a0b06b59c80100005d0bb7990010 (Datastores: “TowerVCL2_Tier1_STTEG1_DS11”) restored. Path vmhba3:C0:T6:L10 is active again.

2: Pure: Both paths on vmhba3 went down, but vmhba2 was not affected, so its paths stayed active. The ESXi host lost redundant paths to the storage device, but more than one active path remained. Hence the error "Path redundancy to storage device degraded".

naa.624a93702bd54de17bba40e700011022:
Device Display Name: PURE Fibre Channel Disk (naa.624a93702bd54de17bba40e700011022)
Storage Array Type: VMW_SATP_ALUA
Storage Array Type Device Config: {implicit_support=on; explicit_support=off; explicit_allow=on; alua_followover=on; action_OnRetryErrors=off; {TPG_id=0,TPG_state=AO}}
Path Selection Policy: VMW_PSP_RR
Path Selection Policy Device Config: {policy=iops,iops=1,bytes=10485760,useANO=0; lastPathIndex=0: NumIOsPending=0,numBytesPending=0}
Path Selection Policy Device Custom Config:
Working Paths: vmhba2:C0:T4:L254, vmhba3:C0:T5:L254, vmhba2:C0:T5:L254, vmhba3:C0:T4:L254
Is USB: false

2019-07-29T21:42:03.955Z: [scsiCorrelator] 8734477433429us: [esx.problem.storage.redundancy.degraded] Path redundancy to storage device naa.624a93702bd54de17bba40e700011022 degraded. Path vmhba3:C0:T4:L254 is down. Affected datastores: “TowerVCL2_Tier0_STpureC_DS9”.
2019-07-29T21:42:03.967Z: [scsiCorrelator] 8734477445626us: [esx.problem.storage.redundancy.degraded] Path redundancy to storage device naa.624a93702bd54de17bba40e700011022 degraded. Path vmhba3:C0:T5:L254 is down. Affected datastores: “TowerVCL2_Tier0_STpureC_DS9”.
2019-07-29T21:46:57.928Z: [scsiCorrelator] 8734771405808us: [esx.clear.storage.redundancy.restored] Path redundancy to storage device naa.624a93702bd54de17bba40e700011022 (Datastores: “TowerVCL2_Tier0_STpureC_DS9”) restored. Path vmhba3:C0:T4:L254 is active again.

3: COMPELNT: Both paths on vmhba3 went down, while vmhba2 never went down, so access continued. The ESXi host lost redundant paths to the storage device, but more than one active path remained. Hence the error "Path redundancy to storage device degraded".

naa.6000d310007ef5000000000000000096:
Device Display Name: COMPELNT Fibre Channel Disk (naa.6000d310007ef5000000000000000096)
Storage Array Type: VMW_SATP_ALUA
Storage Array Type Device Config: {implicit_support=on; explicit_support=off; explicit_allow=on; alua_followover=on; action_OnRetryErrors=off; {TPG_id=61486,TPG_state=AO}{TPG_id=61490,TPG_state=AO}{TPG_id=61485,TPG_state=AO}{TPG_id=61489,TPG_state=AO}}
Path Selection Policy: VMW_PSP_RR
Path Selection Policy Device Config: {policy=iops,iops=1,bytes=10485760,useANO=0; lastPathIndex=2: NumIOsPending=0,numBytesPending=0}
Path Selection Policy Device Custom Config:
Working Paths: vmhba3:C0:T2:L6, vmhba3:C0:T3:L6, vmhba2:C0:T2:L6, vmhba2:C0:T3:L6
Is USB: false

2019-07-29T21:42:03.912Z: [scsiCorrelator] 8734477390138us: [esx.problem.storage.redundancy.degraded] Path redundancy to storage device naa.6000d310007ef5000000000000000096 degraded. Path vmhba3:C0:T2:L6 is down. Affected datastores: “TowerVCL2_Tier1_STcomp1_DS6”.
2019-07-29T21:42:03.916Z: [scsiCorrelator] 8734477394585us: [esx.problem.storage.redundancy.degraded] Path redundancy to storage device naa.6000d310007ef5000000000000000096 degraded. Path vmhba3:C0:T3:L6 is down. Affected datastores: “TowerVCL2_Tier1_STcomp1_DS6”.
2019-07-29T21:46:57.877Z: [scsiCorrelator] 8734771354867us: [esx.clear.storage.redundancy.restored] Path redundancy to storage device naa.6000d310007ef5000000000000000096 (Datastores: “TowerVCL2_Tier1_STcomp1_DS6”) restored. Path vmhba3:C0:T2:L6 is active again.
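The three examples above reduce to a single rule: if a failure leaves exactly one active path, the event is a redundancy loss; if more than one active path remains, redundancy is only degraded. A sketch of that classification (path lists taken from the examples; the logic is illustrative, not ESXi source):

```python
def redundancy_event(working_paths, failed_paths):
    """Classify a path-failure event the way the scsiCorrelator messages do."""
    remaining = [p for p in working_paths if p not in failed_paths]
    if not remaining:
        return "all paths down"          # APD/PDL territory, not covered here
    if len(remaining) == 1:
        return "lost path redundancy"    # single point of failure
    return "path redundancy degraded"    # still more than one active path

# Tegile: one path per HBA; vmhba3 fails -> one path left.
tegile = ["vmhba2:C0:T6:L10", "vmhba3:C0:T6:L10"]
print(redundancy_event(tegile, ["vmhba3:C0:T6:L10"]))
# lost path redundancy

# Pure: four paths; both vmhba3 paths fail -> two still active.
pure = ["vmhba2:C0:T4:L254", "vmhba3:C0:T5:L254",
        "vmhba2:C0:T5:L254", "vmhba3:C0:T4:L254"]
print(redundancy_event(pure, ["vmhba3:C0:T4:L254", "vmhba3:C0:T5:L254"]))
# path redundancy degraded
```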

To fix the underlying path failures, see:

https://kb.vmware.com/s/article/1009554

https://kb.vmware.com/s/article/1009555

VM is showing disk size of 0B

Sometimes the backup for such a VM fails with this error message:

Virtual disk configuration change detected, resetting CBT failed Details: A general system error occurred: 
Creating VM snapshot
Error: A general system error occurred: 

Several documents cover this error. Rebooting the host was not an option because this was a production cluster. After some research, the following methods fixed the issue.

1: Reboot the Virtual Machine.

This task refreshes the .vmx file and returns the VM to a healthy state.

2: There is no guarantee that the VM can be recovered, so keep an older backup ready and restore the virtual machine from it if needed.

3: The virtual machine’s .vmx configuration file can be reloaded from the command line. This operation does not generate a new Inventory ID (Vmid) for the virtual machine and allows it to stay in the same resource pool.

To resolve this issue, reload the virtual machine’s .vmx configuration file.

To reload the virtual machine’s .vmx configuration file, perform one of these options:

  • Reload the configuration files of all the virtual machines on the ESXi/ESX host by running this command:

    for a in $(vim-cmd vmsvc/getallvms 2>&1 | grep invalid | awk '{print $4}' | cut -d \' -f2); do vim-cmd vmsvc/reload $a; done

  • Reload the .vmx configuration file of a single virtual machine from the command line:
    1. Log in to the Local Tech Support Mode console of the ESXi/ESX host.
    2. Obtain the Inventory ID (Vmid) for the virtual machine by running this command:

       # vim-cmd vmsvc/getallvms

      Note: The output shows virtual machines which are registered on the ESXi/ESX host.

      You see output similar to:

      Vmid Name File Guest OS Version Annotation
      2848 Test.vmx winNetEnterpriseGuest vmx-07 To be used as a template

      In this example, the Vmid is 2848.

    3. Reload the .vmx file by running this command:

       # vim-cmd vmsvc/reload Vmid

    Either method should fix the issue.
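For clarity, the filtering done by the grep/awk/cut one-liner above can be sketched in Python. It assumes invalid VMs appear in the vim-cmd vmsvc/getallvms output as lines like "Skipping invalid VM '2848'" (which is what the pipeline extracts the Vmid from); verify the exact wording on your host before relying on it:

```python
import re

def invalid_vmids(getallvms_output):
    """Extract the Vmids of invalid VMs from vim-cmd vmsvc/getallvms output."""
    return re.findall(r"invalid VM '(\d+)'", getallvms_output)

# Hypothetical sample output with one healthy VM and one invalid entry.
sample = (
    "Vmid   Name   File   Guest OS   Version   Annotation\n"
    "12     web01  [ds1] web01/web01.vmx  rhel7_64Guest  vmx-13\n"
    "Skipping invalid VM '2848'\n"
)
print(invalid_vmids(sample))  # ['2848']
```

Each returned Vmid is what the shell loop passes to vim-cmd vmsvc/reload.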


Enable or Disable UEFI Secure Boot for a Virtual Machine

UEFI Secure Boot is a security standard that helps ensure that your PC boots using only software that is trusted by the PC manufacturer. For certain virtual machine hardware versions and operating systems, you can enable secure boot just as you can for a physical machine.

In an operating system that supports UEFI secure boot, each piece of boot software is signed, including the bootloader, the operating system kernel, and operating system drivers. The virtual machine’s default configuration includes several code signing certificates.

  • A Microsoft certificate that is used only for booting Windows.
  • A Microsoft certificate that is used for third-party code that is signed by Microsoft, such as Linux bootloaders.
  • A VMware certificate that is used only for booting ESXi inside a virtual machine.

The virtual machine’s default configuration also includes a Microsoft KEK (Key Exchange Key) certificate, which authenticates requests from inside the virtual machine to modify the secure boot configuration, including the secure boot revocation list.

In almost all cases, it is not necessary to replace the existing certificates.

VMware Tools version 10.1 or later is required for virtual machines that use UEFI secure boot. You can upgrade those virtual machines to a later version of VMware Tools when it becomes available.

For Linux virtual machines, VMware Host-Guest Filesystem is not supported in secure boot mode. Remove VMware Host-Guest Filesystem from VMware Tools before you enable secure boot.

Note: If you turn on secure boot for a virtual machine, you can load only signed drivers into that virtual machine.

This task describes how to use the vSphere Client to enable and disable secure boot for a virtual machine. You can also write scripts to manage virtual machine settings. For example, you can automate changing the firmware from BIOS to EFI for virtual machines with the following PowerCLI code:

$vm = Get-VM TestVM
$spec = New-Object VMware.Vim.VirtualMachineConfigSpec
$spec.Firmware = [VMware.Vim.GuestOsDescriptorFirmwareType]::efi
$vm.ExtensionData.ReconfigVM($spec)

See VMware PowerCLI User’s Guide for more information.

Prerequisites

You can enable secure boot only if all prerequisites are met. If prerequisites are not met, the check box is not visible in the vSphere Client.

  • Verify that the virtual machine operating system and firmware support UEFI boot.
    • EFI firmware
    • Virtual hardware version 13 or later.
    • Operating system that supports UEFI secure boot.
    Note: Some guest operating systems do not support changing from BIOS boot to UEFI boot without guest OS modifications. Consult your guest OS documentation before changing to UEFI boot. If you upgrade a virtual machine that already uses UEFI boot to an operating system that supports UEFI secure boot, you can enable Secure Boot for that virtual machine.
  • Turn off the virtual machine. If the virtual machine is running, the check box is dimmed.

Procedure

  1. Browse to the virtual machine in the vSphere Client inventory.
  2. Right-click the virtual machine and select Edit Settings.
  3. Click the VM Options tab, and expand Boot Options.
  4. Under Boot Options, ensure that firmware is set to EFI.
  5. Select your task.
    • Select the Secure Boot check box to enable secure boot.
    • Deselect the Secure Boot check box to disable secure boot.
  6. Click OK.

Results

When the virtual machine boots, only components with valid signatures are allowed. The boot process stops with an error if it encounters a component with a missing or invalid signature.

General Recommendations for Boot from iSCSI SAN

If you plan to set up and use an iSCSI LUN as the boot device for your host, follow certain general guidelines.

The following guidelines apply to booting from the independent hardware iSCSI and iBFT.

  • Review any vendor recommendations for the hardware you use in your boot configuration.
  • For installation prerequisites and requirements, review vSphere Installation and Setup.
  • Use static IP addresses to reduce the chances of DHCP conflicts.
  • Use different LUNs for VMFS datastores and boot partitions.
  • Configure proper ACLs on your storage system.
    • The boot LUN must be visible only to the host that uses the LUN. No other host on the SAN is permitted to see that boot LUN.
    • If a LUN is used for a VMFS datastore, multiple hosts can share the LUN.
  • Configure a diagnostic partition.
    • With the independent hardware iSCSI only, you can place the diagnostic partition on the boot LUN. If you configure the diagnostic partition in the boot LUN, this LUN cannot be shared across multiple hosts. If a separate LUN is used for the diagnostic partition, multiple hosts can share the LUN.
    • If you boot from SAN using iBFT, you cannot set up a diagnostic partition on a SAN LUN. To collect your host’s diagnostic information, use the vSphere ESXi Dump Collector on a remote server. For information about the ESXi Dump Collector, see vCenter Server Installation and Setup and vSphere Networking.

Breaking Device Locks using vmkfstools

A VMFS volume on a VMware ESX/ESXi host can become locked due to an I/O error.

Example:

If naa.xxxxxx:1 represents one of the partitions used in a VMFS volume, you see this message when the VMFS volume is inaccessible:
volume on device naa.xxxxxx:1 locked, possibly because remote host x.x.x.x encountered an error during a volume operation and couldn’t recover.

If this issue occurs, the VMFS volume (and the virtual machines residing on the affected volume) are unavailable to the ESX/ESXi host.

In the /var/log/vmkernel.log file, you may see similar messages indicating the same issue:
WARNING: LVM: 12345: The volume on the device naa.xxxxxx:1 locked, possibly because some remote host encountered an error during a volume operation and could not recover.
LVM: 11245: Failed to open device naa.xxxxxx:1 : Lock was not free

 

Use the vmkfstools command to break the device lock on a particular partition.

-B|--breaklock device

When entering the device parameter, use the following format:

/vmfs/devices/disks/disk_ID:P

You can use this command when a host fails in the middle of a datastore operation, such as expanding the datastore, adding an extent, or resignaturing. When you run this command, make sure that no other host is holding the lock.

 

  1. To break the lock:
    1. Break the existing LVM lock on the datastore by running this command:

      # vmkfstools -B /vmfs/devices/disks/disk_ID:P

      Note: You can also use the parameter --breaklock instead of -B with the vmkfstools command.

      From the preceding error message, this command is used:

      # vmkfstools -B /vmfs/devices/disks/naa.60060160b3c018009bd1e02f725fdd11:1

      You see output similar to:

      VMware ESX Question:
      LVM lock on device /vmfs/devices/disks/naa.60060160b3c018009bd1e02f725fdd11:1 will be forcibly broken. Please consult vmkfstools or ESX documentation to understand the consequences of this.

      Please ensure that multiple servers aren’t accessing this device.

      Continue to break lock?
      0) Yes
      1) No

      Please choose a number [0-1]:

    2. Enter 0 to break the lock.
    3. Re-read and reload VMFS datastore metadata to memory by running this command:

      # vmkfstools -V

    4. From the vSphere UI, refresh the Storage Datastores View under Configuration tab.

Note: This issue can also be resolved by restarting all the hosts in the cluster.

 

Managing SCSI Reservations of LUNs using vmkfstools

Use the vmkfstools command to reserve a SCSI LUN for exclusive use by the ESXi host. You can also release a reservation so that other hosts can access the LUN, and reset a reservation, forcing all reservations from the target to be released.

-L|--lock [reserve|release|lunreset|targetreset|busreset|readkeys|readresv] device

Caution: Using the -L option can interrupt the operations of other servers on a SAN. Use the -L option only when troubleshooting clustering setups.

Unless advised by VMware, never use this option on a LUN hosting a VMFS volume.

You can specify the -L option in several ways:

  • -L reserve – Reserves the specified LUN. After the reservation, only the server that reserved that LUN can access it. If other servers attempt to access that LUN, a reservation error appears.
  • -L release – Releases the reservation on the specified LUN. Other servers can access the LUN again.
  • -L lunreset – Resets the specified LUN by clearing any reservation on the LUN and making the LUN available to all servers again. The reset does not affect any of the other LUNs on the device. If another LUN on the device is reserved, it remains reserved.
  • -L targetreset – Resets the entire target. The reset clears any reservations on all the LUNs associated with that target and makes the LUNs available to all servers again.
  • -L busreset – Resets all accessible targets on the bus. The reset clears any reservation on all the LUNs accessible through the bus and makes them available to all servers again.
  • -L readkeys – Reads the reservation keys registered with a LUN. Applies to SCSI-III persistent group reservation functionality.
  • -L readresv – Reads the reservation state on a LUN. Applies to SCSI-III persistent group reservation functionality.

When entering the device parameter, use the following format:

/vmfs/devices/disks/disk_ID:x

Space Reclamation Requests from Guest OS

ESXi supports the unmap commands issued directly from a guest operating system to reclaim storage space. The level of support and requirements depend on the type of datastore where your virtual machine resides.

Inside a virtual machine, storage space is freed when, for example, you delete files on the thin virtual disk. The guest operating system notifies VMFS about freed space by sending the unmap command. The unmap command sent from the guest operating system releases space within the VMFS datastore. The command then proceeds to the array, so that the array can reclaim the freed blocks of space.

Space Reclamation for VMFS6 Virtual Machines

VMFS6 generally supports automatic space reclamation requests that generate from the guest operating systems, and passes these requests to the array. Many guest operating systems can send the unmap command and do not require any additional configuration. The guest operating systems that do not support the automatic unmaps might require user intervention. For information about guest operating systems that support the automatic space reclamation on VMFS6, contact your vendor.

Generally, the guest operating systems send the unmap commands based on the unmap granularity they advertise. For details, see documentation provided with your guest operating system.

The following considerations apply when you use space reclamation with VMFS6:

  • VMFS6 processes the unmap request from the guest OS only when the space to reclaim equals 1 MB or is a multiple of 1 MB. If the space is less than 1 MB or is not aligned to 1 MB, the unmap requests are not processed.
  • For VMs with snapshots in the default SEsparse format, VMFS6 supports the automatic space reclamation only on ESXi hosts version 6.7 or later. If you migrate VMs to ESXi hosts version 6.5 or earlier, the automatic space reclamation stops working for the VMs with snapshots.

    Space reclamation affects only the top snapshot and works when the VM is powered on.
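The 1 MB granularity rule above can be expressed directly. This is an illustrative check of when VMFS6 acts on a guest unmap request, not the actual VMFS code path:

```python
MB = 1024 * 1024

def vmfs6_processes_unmap(offset_bytes, length_bytes):
    """VMFS6 acts on a guest unmap request only when the region is
    aligned to 1 MB and its length is a whole multiple of 1 MB."""
    return (offset_bytes % MB == 0
            and length_bytes % MB == 0
            and length_bytes > 0)

print(vmfs6_processes_unmap(0, 4 * MB))        # True: aligned, 4 MB
print(vmfs6_processes_unmap(512 * 1024, MB))   # False: offset not 1 MB aligned
print(vmfs6_processes_unmap(0, MB // 2))       # False: less than 1 MB
```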

Space Reclamation for VMFS5 Virtual Machines

Typically, the unmap command that the guest operating system generates on VMFS5 cannot be passed directly to the array. You must run the esxcli storage vmfs unmap command to trigger unmaps for the array.

However, for a limited number of the guest operating systems, VMFS5 supports the automatic space reclamation requests.

To send the unmap requests from the guest operating system to the array, the virtual machine must meet the following prerequisites:

  • The virtual disk must be thin-provisioned.
  • Virtual machine hardware must be of version 11 (ESXi 6.0) or later.
  • The advanced setting EnableBlockDelete must be set to 1.
  • The guest operating system must be able to identify the virtual disk as thin.
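Taken together, the prerequisites above amount to a simple conjunction. A hedged sketch (not an exhaustive support matrix; parameter names are illustrative):

```python
def vmfs5_guest_unmap_possible(disk_thin, hw_version,
                               enable_block_delete, guest_sees_thin):
    """Check the four prerequisites listed above for guest-initiated
    unmap on VMFS5."""
    return (disk_thin                    # virtual disk is thin-provisioned
            and hw_version >= 11         # hardware version 11 (ESXi 6.0) or later
            and enable_block_delete == 1 # host advanced setting EnableBlockDelete
            and guest_sees_thin)         # guest OS identifies the disk as thin

print(vmfs5_guest_unmap_possible(True, 13, 1, True))   # True
print(vmfs5_guest_unmap_possible(False, 13, 1, True))  # False: disk not thin
```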

Reclaiming Space with SCSI Unmap

vSAN 6.7 Update 1 and later supports SCSI UNMAP commands that enable you to reclaim storage space that is mapped to a deleted vSAN object.

Deleting or removing files frees space within the file system. This free space is mapped to a storage device until the file system releases or unmaps it. vSAN supports reclamation of free space, which is also called the unmap operation. You can free storage space in the vSAN datastore when you delete or migrate a VM, consolidate a snapshot, and so on.

Reclaiming storage space can provide higher host-to-flash I/O throughput and improve flash endurance.

vSAN also supports the SCSI UNMAP commands issued directly from a guest operating system to reclaim storage space. vSAN supports offline unmaps as well as inline unmaps. On Linux OS, offline unmaps are performed with the fstrim(8) command, and inline unmaps are performed when the mount -o discard command is used. On Windows OS, NTFS performs inline unmaps by default.

Unmap capability is disabled by default. To enable unmap on a vSAN cluster, use the following RVC command: vsan.unmap_support --enable

When you enable unmap on a vSAN cluster, you must power off and then power on all VMs. VMs must use virtual hardware version 13 or above to perform unmap operations.