Troubleshooting slow NFS datastores

July 26, 2022 ESXi, Uncategorized, VMware neudorfer

I was recently approached by our server team, who were seeing slow transfer rates on VMs. They noticed it first in their application, but were able to run a dd command to show that certain VMs were only getting about 4 MB/s when they should have been getting up to 1 GB/s. This only happened when the command was run against our datastore.

 

dd if=/dev/sda of=/dev/null bs=1M count=100 status=progress
52428800 bytes (52 MB) copied, 10.988102 s, 4.8 MB/s
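The rate dd reports is simply bytes copied divided by elapsed seconds, which makes it easy to sanity-check a result or compare runs from different VMs. The following is a minimal sketch, assuming awk is available on the VM; throughput_mbs is a hypothetical helper name, and MB is taken as 1,000,000 bytes to match the units dd prints.

```shell
# Hypothetical helper: convert a byte count and elapsed seconds into MB/s.
# MB is treated as 1,000,000 bytes, matching the units in dd's summary line.
throughput_mbs() {
    awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.1f\n", bytes / secs / 1000000 }'
}

# Figures taken from the dd output above: prints 4.8
throughput_mbs 52428800 10.988102
```

Running the same calculation against a healthy VM gives a quick baseline to compare with.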

 

The problem showed up on multiple ESXi hosts but not on others in the same cluster. This narrowed it down to something on this rack, which probably meant an issue at the top-of-rack switch, but I couldn’t figure out what. I decided to run a packet capture on one of the offending hosts.

pktcap-uw --uplink vmnic1 -o /tmp/packetcapture.pcap

I noticed enough malformed packets and retransmits in the capture that I decided to check the error counters on our vmnic ports.

 

esxcli network nic stats get -n vmnic0 | egrep "Total receive errors|Receive CRC errors|Receive missed errors"

and

esxcli network nic stats get -n vmnic1 | egrep "Total receive errors|Receive CRC errors|Receive missed errors"

I checked multiple hosts and saw the same pattern across the board: a high number of CRC errors on one port but not the other.
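Checking every uplink by hand gets tedious across several hosts, so the per-port commands above can be wrapped in a small loop. This is a rough sketch rather than a tested tool: the function names are hypothetical, and it assumes that the first two lines of esxcli network nic list are the table header and separator and that the first column is the NIC name.

```shell
# Keep only the three receive-error counters from
# `esxcli network nic stats get` output supplied on stdin.
nic_rx_errors() {
    grep -E 'Total receive errors|Receive CRC errors|Receive missed errors'
}

# Print just the numeric value of the "Receive CRC errors" counter.
crc_errors() {
    awk '/Receive CRC errors/ { print $NF }'
}

# Hypothetical wrapper: walk every NIC the host reports and show its counters.
# Assumes lines 1-2 of `esxcli network nic list` are header and separator,
# and that the NIC name is the first column.
report_all_nics() {
    for nic in $(esxcli network nic list | awk 'NR > 2 { print $1 }'); do
        echo "== $nic =="
        esxcli network nic stats get -n "$nic" | nic_rx_errors
    done
}
```

On a host, calling report_all_nics prints the counters for each uplink in turn. Since the counters are cumulative, running it twice a few minutes apart shows whether the CRC errors are still climbing.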

 

 

slow NFS

Bad host

Bad ports…..

Total receive errors!!!!!

Lots of total receive errors!!!!!!

Multicast bad actor? 

packet capture

…. to be continued
