UPDATE: A full research report on the NMI and other passthrough attacks is available online at: http://www.crysys.hu/~pek/pubs/Pek+14ACMAsiaCCS.pdf

Gábor Pék (CrySyS Lab)  started to explore possible vulnerabilities in the use of directly assigned (passthrough) devices in contemporary hypervisors (Xen and KVM). As a result of this research, he pointed out some misbehaviors in the interrupt handling method of current platforms. One of this issues is going to be presented in this article.  A paper is to be published about all the discovered issues in collaboration with other researchers from Eurecom, France.

Description

Direct-device assignment is one of the most controversial issues in hardware virtualization, as it allows for using devices almost at native speed, however, raises many security problems.  As most of these issues can be evaded by properly configured system software and hardware, the security issues of that area seemed to be solved.  At the same time, virtual instances with direct-device assignment are publicly available  via various cloud providers, so the security issues have to be examined in more details. In this article,  an interesting vulnerability is going to be presented which is not a simple software bug, but an example for an issue on how to handle improperly a hardware-level mechanism: the interrupt generation.

More precisely, native host-side Non-Maskable Interrupts (NMI) can be generated on systems (e.g., Xen, KVM etc) with System Error Reporting (SERR) enabled via a passthrough device being attached to a fully virtualized  (HVM) guest even when all the available hardware  protection mechanisms are turned on (Intel VT-d DMA and Interrupt Remapping).

As a result of the NMI, the corresponding host-side NMI handler is executed which may cause Denial of Service (DoS).

To reproduce the issue, the attacker has to create a malformed MSI request by writing to the LAPIC MMIO space (0xFEExxxxx) in the guest physical address range. This can be accomplished by a modified DMA transaction. Examples for malformed MSI requests include Interrupt Vector numbers being set below 16 or MSI requests with invalid size (not DWORD).

Output of Xen 4.2 Dom0 after the attack (xl dmesg):

(XEN) NMI – PCI system error (SERR)

Output of KVM 3.4 host after the attack (/var/log/kern.log):

NMI: PCI system error (SERR) for reason a1 on CPU 0.
Dazed and confused, but trying to continue

The NMI is reported as a result of an ‘Unsupported Request’ PCI system error, and has nothing to do with compatibility format MSIs with delivery mode of NMI. Note that Interrupt Remapping disables the use of compatibility format MSIs by clearing the GCMD.CFI flag. By turning x2APIC mode on system software cannot even reconfigure that flag.

Later investigations showed that the host-side NMI was not generated by the PCI device being assigned to a fully virtualized (hardware-assisted) guest, but the Host Bridge(00:00.0). See the output of lspci.

Before the attack:

00:00.0 Host bridge: Intel Corporation 2nd Generation Core Processor Family DRAM Controller (rev 09)
Subsystem: …
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
Latency: 0
Capabilities: [e0] Vendor Specific Information: Len=0c <?>

After the attack:

00:00.0 Host bridge: Intel Corporation 2nd Generation Core Processor Family DRAM Controller (rev 09)
Subsystem: …
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR+ <PERR- INTx-
Latency: 0
Capabilities: [e0] Vendor Specific Information: Len=0c <?>

Vulnerable Systems

Affected platforms enable System Error Reporting, where the corresponding NMI handler is executed in the hypervisor/host OS that should have never happened in normal circumstances when Intel VT-d Interrupt Remapping is enabled. Note that only Xen and KVM were tested and verified, however, every other system software can be affected which runs above a platform with SERR and Intel VT-d enabled.

Mitigation
While Intel does not recommend to turn SERR reporting on by default, some platforms do enable it as it carries essential information about legal system errors.

SERR reporting can either be disabled for the Host Bridge, or system software can block SERR error signaling due to Unsupported Request error resulting from malformed MSI requests. The former advice is quite intrusive as it suppresses all the system errors coming from the Host Bridge. At the same time, this is supported by all the chipsets. The second option is a more fine-grained solution, however, there is no information whether it is applicable to all Intel chipsets.

As a consequence, there is no real solution available for the issue for now.

References

Corresponding CVE number: CVE-2013-3495
Corresponding Xen Security Advisory number: XSA-59

Acknowledgement

Many thanks go to Jan Beulich from the Xen security team, the Intel Product Security Incident Response Team and Rafal Wojtczuk from Bromium for the in-depth discussions, recommendations and advices.

More details about this issue is going to be published in our upcoming paper.

May you have any questions please let us know.

 

 

Leave a Reply