Improving network performance using network I/O control

The 1 GigE era is coming to an end and is rapidly being replaced by 10 GigE. This means fewer physical network connections, and network traffic with different patterns and needs will converge on the same network.

This may directly impact performance and predictability due to the lack of isolation, scheduling, and arbitration. Network I/O control can be used to prioritize the different types of traffic sharing the same pipe.

You cannot guarantee bandwidth for one kind of traffic unless you limit the other traffic, so that there is always enough bandwidth left available for it.

Because some traffic (for example, vMotion) is not active all the time, static limits leave bandwidth temporarily unused. As long as there is no congestion, this does not really matter; but when there is, you are still limiting traffic even though unused bandwidth is available, which is not a good way to deal with congestion.

For some VMkernel traffic types, VMware recommends dedicated 1 GbE NICs to guarantee them bandwidth. With everything consolidated onto fewer 10 GigE uplinks, it is very likely that there will no longer be a dedicated NIC for VM traffic. Without some form of QoS or guaranteed bandwidth, the static traffic shaping used previously simply wastes bandwidth, so a more dynamic solution is needed.

The solution is Network I/O Control (NIOC) on the vDS.

VMware has predefined resource groups, as shown in the previous screenshot.

These traffic groups are based on vDS ports: whatever connects to a given port has its traffic assigned to one of the predefined groups. This means that if we mount an NFS share inside a VM, that traffic is treated as VM traffic rather than NFS traffic. Again, assignment to a group is based on the switch port, not on the characteristics of the traffic.

Two values that you can edit for each predefined network resource pool are Physical adapter shares and Host limit.

Shares work like the CPU and memory shares on VMs: NIOC sums up all the shares and computes each group's entitlement relative to that sum. So a value of 50 does not necessarily mean the group is entitled to 50 percent of the bandwidth (although it would be if the shares of all the other groups also summed up to 50).

The share value does not reflect the number of ports in the group, so it does not matter whether 10 or 100 VMs are running; the percentage for the group as a whole stays the same. Consequently, no further subdivision is possible inside a resource group.

The defined share values are:

  • Normal = 50
  • Low = 25
  • High = 100
  • Custom = any value between 1 and 100

The default share value for VM traffic is 100 (High), and for all other traffic types it is 50 (Normal).
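
To make the arithmetic concrete, here is a minimal Python sketch of the share calculation under full contention, using the default share values described above. The list of traffic types and the 10 GbE link speed are assumptions made purely for illustration, not the exact set of predefined pools:

```python
# Minimal sketch of how NIOC turns shares into bandwidth entitlements
# under contention. Only the share values (VM = 100, everything else = 50)
# come from the defaults above; the pool names and the 10 GbE link speed
# are illustrative assumptions.

LINK_SPEED_GBIT = 10.0  # assumed 10 GigE uplink

shares = {
    "VM traffic": 100,            # default: High
    "vMotion": 50,                # default: Normal
    "iSCSI": 50,
    "NFS": 50,
    "Management": 50,
    "Fault tolerance": 50,
    "vSphere Replication": 50,
}

total = sum(shares.values())

for name, share in shares.items():
    fraction = share / total
    print(f"{name:20s} {share:3d} shares -> {fraction:5.1%} "
          f"= {fraction * LINK_SPEED_GBIT:.2f} Gbit/s under contention")
```

With these example numbers, VM traffic is entitled to 25 percent of the link and every other group to 12.5 percent, and only when the uplink is actually congested.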

If some resource groups are not used at all, for example, no FT (short for fault tolerance), no NFS, and no vSphere Replication, their shares still apply; but since no traffic in those groups claims bandwidth, that bandwidth is given to the other groups based on their share values.

So if all of the traffic together does not need the full bandwidth of the pNIC, there may still be unused bandwidth, but it is dynamically handed to any resource group that requests more bandwidth later. With the earlier static approach, this bandwidth was simply never used, because the limits were configured around peak rates and bursts.
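
The following sketch models that redistribution in a simplified way: pools that carry no traffic are simply left out of the calculation, so their bandwidth falls to the active pools in proportion to their shares. The pool names and the 10 GbE link speed are again only illustrative assumptions:

```python
# Simplified model of share-based redistribution when some pools are idle:
# idle pools claim nothing, so the active pools split the link according
# to their shares. Pool names and link speed are illustrative assumptions.

LINK_SPEED_GBIT = 10.0

shares = {"VM traffic": 100, "vMotion": 50, "iSCSI": 50,
          "NFS": 50, "Fault tolerance": 50, "vSphere Replication": 50}

# Assume FT, NFS, and vSphere Replication are not used on this host.
idle = {"Fault tolerance", "NFS", "vSphere Replication"}
active = {name: s for name, s in shares.items() if name not in idle}

total_active = sum(active.values())
for name, share in active.items():
    print(f"{name:12s} -> {share / total_active * LINK_SPEED_GBIT:.2f} Gbit/s")
```

In this example, VM traffic can use up to 5 Gbit/s and vMotion and iSCSI up to 2.5 Gbit/s each, instead of being capped at whatever a static configuration would have assigned.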

Limits are useful when you do not want other traffic to be affected too much, for example, by vMotion. Let's assume that VM and iSCSI traffic usually consumes nearly all of the available bandwidth. Now, if vMotion starts up and claims its default share of around 14 percent of the bandwidth, it will affect the VM and iSCSI traffic, so you might want to limit vMotion to 1 Gbit/s.
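
As a rough illustration of how such a host limit interacts with the share-based entitlement in this scenario (a 10 GbE uplink, vMotion at roughly 14 percent, and a limit of 1 Gbit/s), the numbers below are taken from the example above and are not meant as exact NIOC behavior:

```python
# Sketch: a host limit caps a pool regardless of its share-based entitlement.
# On a 10 GbE uplink, ~14 percent would be about 1.4 Gbit/s, but the
# configured 1 Gbit/s limit wins. Values are illustrative.

LINK_SPEED_GBIT = 10.0

vmotion_share_fraction = 0.14   # ~14 percent, as in the example above
vmotion_limit_gbit = 1.0        # host limit set by the administrator

entitlement = vmotion_share_fraction * LINK_SPEED_GBIT
effective = min(entitlement, vmotion_limit_gbit)

print(f"Share-based entitlement: {entitlement:.1f} Gbit/s")
print(f"Effective bandwidth with limit applied: {effective:.1f} Gbit/s")
```

Keep in mind that, unlike shares, a host limit is enforced even when the link is otherwise idle, which is exactly the drawback of static limits described earlier.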
