I recently finished working a case with VMware support and Cisco TAC for a new customer. One of their hosts became totally unresponsive: the VMs were still up, but the host showed as disconnected in vCenter and you could not even log in at the host's own console. The same thing had happened to a second host about a week earlier. Both hosts had to be rebooted to recover.
According to Cisco TAC, this is a known bug with ESXi 5 on UCS C-series servers. Cisco’s short-term suggestion was to disable Interrupt Remapping, either in ESXi or in the BIOS – they claim to have had a host running in a lab with this configuration for three weeks without an issue. Supposedly, Cisco is releasing a BIOS upgrade that will correct this issue “within the next couple of weeks.” Unfortunately, it has already been several weeks since this incident and there is still no BIOS update, but I will update this post as soon as I hear otherwise.
On a mostly unrelated but similarly important note, Cisco also confirmed that Call Manager is not yet supported on ESXi 5.
Interrupt Remapping directions from VMware:
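For quick reference, here is a rough sketch of what the ESXi-side change looks like. It assumes the relevant kernel option is `iovDisableIR`, set through `esxcli system settings kernel`, which is how I understand VMware's directions for ESXi 5.x; check VMware's own instructions before relying on it. The Python helper names are made up for illustration. The script only builds and prints the commands; you would run them yourself in the ESXi shell and then reboot the host for the setting to take effect.

```python
# Sketch only: builds the esxcli commands for disabling interrupt remapping.
# Assumption: the kernel option is named iovDisableIR (verify against VMware's
# directions). The function names below are hypothetical helpers.

SETTING = "iovDisableIR"


def disable_ir_command():
    """Command that turns interrupt remapping off (requires a host reboot)."""
    return ["esxcli", "system", "settings", "kernel", "set",
            "--setting=" + SETTING, "--value=TRUE"]


def verify_ir_command():
    """Command that lists the option so you can check its configured value."""
    return ["esxcli", "system", "settings", "kernel", "list",
            "--option=" + SETTING]


if __name__ == "__main__":
    # Print the commands rather than executing them; paste into the ESXi shell.
    for cmd in (disable_ir_command(), verify_ir_command()):
        print(" ".join(cmd))
```

Printing instead of executing is deliberate: this is a change that needs a maintenance window and a reboot, so it seemed safer to keep the actual run as a manual step.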
Link for latest UCS Update Manager (for when the new BIOS is released):