feat: add hardware interrupt support (PIC, KVM IRQ chip, MSHV SynIC, WHP software timer) #1272
danbugs wants to merge 9 commits into hyperlight-dev:main from
Conversation
- Add PvTimerConfig (port 107) and Halt (port 108) to OutBAction enum
- Add userspace 8259A PIC emulation for MSHV/WHP (KVM uses in-kernel PIC)
- Add halt() function in guest exit module using Halt port instead of HLT
- Add default no-op IRQ handler at IDT vector 0x20 for PIC-remapped IRQ0
- Update dispatch epilogue to use Halt port before cli+hlt fallback
- Add hw-interrupts feature flag to hyperlight-host

Signed-off-by: danbugs <danilochiarlone@gmail.com>
- Create IRQ chip (PIC + IOAPIC + LAPIC) and PIT before vCPU creation
- Add hw-interrupts run loop that handles HLT re-entry, Halt port, and PvTimerConfig port (ignored since in-kernel PIT handles scheduling)
- Non-hw-interrupts path also recognizes Halt port for compatibility

Signed-off-by: danbugs <danilochiarlone@gmail.com>
- Enable LAPIC in partition flags for SynIC direct-mode timer delivery
- Configure LAPIC (SVR, TPR, LINT0/1, LVT Timer) during VM creation
- Install MSR intercept on IA32_APIC_BASE to prevent guest from disabling the LAPIC globally
- Add SynIC STIMER0 configuration via PvTimerConfig IO port
- Add userspace PIC emulation integration for MSHV
- Restructure run_vcpu into a loop for HLT re-entry and hw IO handling
- Bridge PIC EOI to LAPIC EOI for SynIC timer interrupt acknowledgment
- Handle PIT/speaker/debug IO ports in userspace

Signed-off-by: danbugs <danilochiarlone@gmail.com>
Add WHP hardware interrupt support using a host-side software timer thread that periodically injects interrupts via WHvRequestInterrupt.

Key changes:
- Detect LAPIC emulation support via WHvGetCapability
- Initialize LAPIC via bulk interrupt-controller state API (WHvGet/SetVirtualProcessorInterruptControllerState2) since individual APIC register writes fail with ACCESS_DENIED
- Software timer thread for periodic interrupt injection
- LAPIC EOI handling for PIC-only guest acknowledgment
- PIC emulation integration for MSHV/WHP shared 8259A
- Filter APIC_BASE from set_sregs when LAPIC emulation active
- HLT re-entry when timer is active

Signed-off-by: danbugs <danilochiarlone@gmail.com>
Signed-off-by: danbugs <danilochiarlone@gmail.com>
Signed-off-by: danbugs <danilochiarlone@gmail.com>
268cf83 to
ddd016c
Compare
…mpatibility MSHV and WHP reject vCPU state with zeroed segment registers (ES, SS, FS, GS, LDT) and uninitialized XSAVE areas. Properly initialize all segment registers in standard_real_mode_defaults() and add reset_xsave() call after set_sregs() to ensure FPU state (FCW, MXCSR) is valid. Signed-off-by: danbugs <danilochiarlone@gmail.com>
Add Halt outb (port 108) before cli/hlt in guest init and dummyguest so KVM's in-kernel LAPIC does not absorb the HLT exit. Also restore the hw_timer_interrupts integration test that was inadvertently dropped. Signed-off-by: danbugs <danilochiarlone@gmail.com>
Signed-off-by: danbugs <danilochiarlone@gmail.com>
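The run-loop behavior described across the commit messages above (Halt port 108 ends the run, HLT re-enters the guest while a timer is active, PvTimerConfig on port 107 is consumed by the host) can be illustrated with a small exit classifier. This is a minimal sketch, not code from the PR: `VmExit`, `Action`, and `classify_exit` are hypothetical names, and treating a bare HLT without an active timer as "stop" is an assumption based on how HLT was used before this change.

```rust
/// Port numbers taken from the commit messages (PvTimerConfig = 107, Halt = 108).
const PORT_PV_TIMER_CONFIG: u16 = 107;
const PORT_HALT: u16 = 108;

/// Hypothetical, simplified view of a VM exit (not a Hyperlight type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VmExit {
    /// The guest executed HLT and the exit reached userspace.
    Hlt,
    /// The guest wrote one byte to an I/O port.
    IoOut { port: u16, value: u8 },
}

/// What the run loop does after an exit (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    /// Re-enter the guest.
    Resume,
    /// Leave the vCPU run loop: the guest signalled that it is done.
    Stop,
    /// Pass the write on to the regular outb handling.
    ForwardOutb { port: u16, value: u8 },
}

/// Classify an exit. `timer_active` says whether a periodic timer is armed.
pub fn classify_exit(exit: VmExit, timer_active: bool) -> Action {
    match exit {
        // The Halt port always ends the run, regardless of timer state.
        VmExit::IoOut { port: PORT_HALT, .. } => Action::Stop,
        // Timer configuration is consumed by the host, then the guest resumes.
        VmExit::IoOut { port: PORT_PV_TIMER_CONFIG, .. } => Action::Resume,
        // Any other port write goes to the normal outb path.
        VmExit::IoOut { port, value } => Action::ForwardOutb { port, value },
        // HLT with an armed timer means "wait for the next interrupt".
        VmExit::Hlt if timer_active => Action::Resume,
        // Assumption: HLT without a timer keeps its old "guest is done" meaning.
        VmExit::Hlt => Action::Stop,
    }
}
```

Keeping the decision in a pure function like this makes the Halt/HLT distinction testable without a hypervisor; the real backends would additionally program the timer or interrupt controller on the PvTimerConfig path.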
syntactically left a comment
> Beyond this immediate use case, hardware interrupt support is a prerequisite for the ring buffer notifier mechanism planned for the `tandr/ring` branch: the upstream `Notifier` trait needs a way to signal the guest that new work is available in a virtqueue/ring buffer, and hardware interrupts are the natural delivery mechanism for that (matching the virtio model of interrupt-driven I/O notification).
I'm not sure we've made this decision yet. I believe the intention was to benchmark whether that made sense, or if a custom ABI (say some register flag set the next time that the guest was reentered through one of the existing entrypoint stubs) ended up being faster (since it would avoid some extra trips up and down through the hv).
I'm actually a bit curious if we have similar data here as well---maybe the complexity of emulating a good fraction of an interrupt controller is worth it for the performance in the KVM case where there's extra support for it, but especially in the other cases, are we sure this is actually any better than just having a custom interface for "jump to this address every so often"? It seems like we don't really need all the complex interrupt routing, priority, etc parts of the interrupt controller---we just need the timer pulse?
> Where the guest expects a legacy PIC but interrupts are actually delivered via LAPIC
Since we don't intend to actually have a PIC at any point, can we just modify the guest to get rid of this assumption when it's being built for the Hyperlight platform?
    use hyperlight_common::outb::OutBAction;
    use tracing::instrument;

    /// Halt the execution of the guest and returns control to the host.
Comment (and maybe name) (and commit message) should clarify that this is meant to be used for wfi rather than actually ending execution as we've been using hlt for in the past?
    hl_exception_handler = sym super::handle::hl_exception_handler,
    );

    // Default no-op IRQ handler for hardware interrupts (vectors 0x20-0x2F).
AFAIUI this is code that should be running in target_arch = i686 only right now, so perhaps ought not be in amd64?
    // Default no-op IRQ handler for hardware interrupts (vectors 0x20-0x2F).
    // Sends a non-specific EOI to the master PIC and returns.
    // This prevents unhandled-interrupt faults when the in-kernel PIT fires
This race condition is still present since this is only being installed in init_idt after the guest has already been running instructions for some time?
If this is a serious concern, surely we need the host to control the vm state a bit better---either only enabling interrupts after initialize() has finished and the guest kernel is up, or presetting the idt state before entering the guest for the first time (although I'm unsure if there is API for that)
    call {internal_dispatch_function}\n
    mov dx, 108\n
    out dx, al\n
    cli\n
Is there any use for this wfi fallback at the end here? If the guest does resume execution on interrupt delivery from a hlt here, something has gone very wrong.
    @@ -0,0 +1,228 @@
    /*
This file should be in some arch (i686?) specific directory, I think.
    .set_lapic(&lapic)
    .map_err(|e| CreateVmError::InitializeVm(e.into()))?;

    // Install MSR intercept for IA32_APIC_BASE (MSR 0x1B) to prevent
This seems like something that should just be fixed by having the guest kernel not look for an I/O APIC when it is being built for hyperlight, rather than something that should be hacked around in Hyperlight.
    if let Ok(mut lapic) = self.vcpu_fd.get_lapic() {
        let svr = read_lapic_u32(&lapic.regs, 0xF0);
        if svr & 0x100 == 0 {
            write_lapic_u32(&mut lapic.regs, 0xF0, 0x1FF);
Where are these values coming from?
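For context on the constants in the snippet above: under the standard xAPIC register layout, offset 0xF0 is the Spurious-Interrupt Vector Register (SVR), bit 8 (0x100) is the APIC software-enable bit, and the low byte holds the spurious vector, so 0x1FF reads as "software-enable with spurious vector 0xFF". The sketch below spells that out with named constants. It is an illustration only: the helper bodies are hypothetical stand-ins that operate on a plain `u8` slice, since the PR's own `read_lapic_u32` and `write_lapic_u32` implementations are not shown in this excerpt.

```rust
/// xAPIC register layout constants.
const LAPIC_SVR_OFFSET: usize = 0xF0; // Spurious-Interrupt Vector Register
const SVR_APIC_SW_ENABLE: u32 = 1 << 8; // bit 8: APIC software enable
const SVR_SPURIOUS_VECTOR: u32 = 0xFF; // bits 0-7: spurious vector number

/// Hypothetical stand-in: read a little-endian u32 from the register page.
pub fn read_lapic_u32(regs: &[u8], offset: usize) -> u32 {
    u32::from_le_bytes([
        regs[offset],
        regs[offset + 1],
        regs[offset + 2],
        regs[offset + 3],
    ])
}

/// Hypothetical stand-in: write a little-endian u32 into the register page.
pub fn write_lapic_u32(regs: &mut [u8], offset: usize, value: u32) {
    regs[offset..offset + 4].copy_from_slice(&value.to_le_bytes());
}

/// Set the software-enable bit if it is clear. Returns true if the page changed.
pub fn enable_lapic_if_disabled(regs: &mut [u8]) -> bool {
    let svr = read_lapic_u32(regs, LAPIC_SVR_OFFSET);
    if (svr & SVR_APIC_SW_ENABLE) == 0 {
        // 0x100 | 0xFF == 0x1FF, the literal used in the diff.
        write_lapic_u32(
            regs,
            LAPIC_SVR_OFFSET,
            SVR_APIC_SW_ENABLE | SVR_SPURIOUS_VECTOR,
        );
        true
    } else {
        false
    }
}
```

Named constants of this kind would also make the intent visible at the call site without a lookup in the architecture manual.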
    const LAPIC_EMULATION_BIT: u64 = 1 << 1;

    #[cfg(feature = "hw-interrupts")]
    let mut lapic_emulation = {
Do we need to support hosts without LAPIC emulation, or could we just error here if it was missing?
    sregs.into();

    // When LAPIC emulation is active, skip writing APIC_BASE.
    // The generic CommonSpecialRegisters defaults APIC_BASE to 0
Should this be changed in the CommonSpecialRegisters code? Is it also wrong on other HVs?
    #[cfg(not(feature = "init-paging"))]
    pub(crate) fn standard_real_mode_defaults() -> Self {
        // In real mode, all data/code segment registers must have valid
Summary
Adds hardware interrupt support to Hyperlight, enabling guest OS kernels to receive timer interrupts for preemptive scheduling. Each hypervisor backend uses its native interrupt delivery mechanism:

- KVM: in-kernel IRQ chip (PIC + IOAPIC + LAPIC) and in-kernel PIT
- MSHV: SynIC direct-mode timer (STIMER0) delivered through the LAPIC
- WHP: host-side software timer thread that calls `WHvRequestInterrupt` for periodic interrupt injection, using the bulk LAPIC state API for initialization

All implementations are gated behind the `hw-interrupts` cargo feature and have no effect on existing behavior when the feature is disabled.

Motivation
Nanvix is a microkernel that requires preemptive scheduling via timer interrupts. Beyond this immediate use case, hardware interrupt support is a prerequisite for the ring buffer notifier mechanism planned for the `tandr/ring` branch: the upstream `Notifier` trait needs a way to signal the guest that new work is available in a virtqueue/ring buffer, and hardware interrupts are the natural delivery mechanism for that (matching the virtio model of interrupt-driven I/O notification).

Key design decisions
No PvTimer abstraction
Each platform goes directly to its native mechanism rather than using a common timer abstraction. This avoids lowest-common-denominator limitations: KVM's in-kernel PIT is zero-overhead, MSHV's SynIC timer is hypervisor-native, and WHP's software timer uses async interrupt injection to work with WHP's blocking `WHvRunVirtualProcessor`.

Guest halt mechanism
The guest signals "I'm done" by writing to `OutBAction::Halt` (port 108) instead of using the HLT instruction. With an in-kernel LAPIC (KVM) or SynIC (MSHV), HLT is absorbed by the hypervisor to wait for the next interrupt; it never reaches userspace as a VM exit. The Halt port write always triggers a VM exit, giving Hyperlight a clean signal to stop the vCPU run loop.

PIC emulation (MSHV/WHP)
A minimal userspace 8259A PIC emulation handles the interrupt acknowledge cycle for MSHV and WHP, where the guest expects a legacy PIC but interrupts are actually delivered via LAPIC. KVM doesn't need this because its in-kernel IRQ chip handles the full PIC/APIC routing natively.
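To make the "interrupt acknowledge cycle" concrete, here is a minimal sketch of a single (master-only) 8259A model: the ICW1 to ICW4 initialization sequence, the interrupt mask register, and non-specific EOI. It is not the PR's `pic.rs`; the `Pic` type and its method names are hypothetical, and it assumes the guest always sends ICW4 (as with the common ICW1 value 0x11) and never uses a slave PIC.

```rust
/// Hypothetical minimal model of a master 8259A PIC.
pub struct Pic {
    vector_base: u8, // vector of IRQ0, programmed by ICW2
    imr: u8,         // interrupt mask register (1 = masked)
    irr: u8,         // interrupt request register (pending lines)
    isr: u8,         // in-service register (acknowledged, awaiting EOI)
    init_step: u8,   // 0 = operational, 1..=3 = expecting ICW2..ICW4
}

impl Pic {
    pub fn new() -> Self {
        // Start with every line masked until the guest programs the PIC.
        Self { vector_base: 0x08, imr: 0xFF, irr: 0, isr: 0, init_step: 0 }
    }

    /// Write to the command port (0x20 on the master).
    pub fn write_command(&mut self, value: u8) {
        if value & 0x10 != 0 {
            // ICW1: begin initialization and clear internal state.
            self.init_step = 1;
            self.imr = 0;
            self.irr = 0;
            self.isr = 0;
        } else if value == 0x20 && self.isr != 0 {
            // OCW2 non-specific EOI: retire the highest-priority
            // in-service line, which is the lowest set bit.
            self.isr &= self.isr - 1;
        }
    }

    /// Write to the data port (0x21 on the master).
    pub fn write_data(&mut self, value: u8) {
        match self.init_step {
            1 => { self.vector_base = value & 0xF8; self.init_step = 2; } // ICW2
            2 => self.init_step = 3, // ICW3 (cascade wiring), ignored here
            3 => self.init_step = 0, // ICW4 (mode bits), ignored here
            _ => self.imr = value,   // OCW1: set the mask
        }
    }

    /// A device (for example a timer) asserts an IRQ line.
    pub fn raise_irq(&mut self, irq: u8) {
        self.irr |= 1 << (irq & 7);
    }

    /// Acknowledge cycle: pick the vector to inject, if any is deliverable.
    pub fn acknowledge(&mut self) -> Option<u8> {
        let pending = self.irr & !self.imr;
        if pending == 0 {
            return None;
        }
        let irq = pending.trailing_zeros() as u8;
        // Fully nested mode: block while an equal or higher priority
        // line is still in service.
        let equal_or_higher = ((1u16 << (irq + 1)) - 1) as u8;
        if self.isr & equal_or_higher != 0 {
            return None;
        }
        self.irr &= !(1 << irq);
        self.isr |= 1 << irq;
        Some(self.vector_base + irq)
    }
}
```

In a backend that delivers through the LAPIC, the host would call something like `acknowledge()` to learn which vector to inject and would treat the guest's write of 0x20 to the command port as the point at which to also signal EOI on the LAPIC side, matching the "Bridge PIC EOI to LAPIC EOI" item in the MSHV commit message.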
Changes
Commit 1: Foundation (PIC, OutBAction variants, guest halt)
- `outb.rs`: `OutBAction::PvTimerConfig` (107) and `OutBAction::Halt` (108) enum variants
- `pic.rs`: Userspace 8259A PIC emulation for MSHV/WHP
- `mod.rs`: Register `pic` module with cfg gate
- `exit.rs`: Guest `halt()` function using Halt port + `cli; hlt` safety fallback
- `entry.rs`: Default no-op IRQ handler at vector 0x20
- `dispatch.rs`: Use halt port in dispatch epilogue
- `outb.rs` (host): `handle_outb` match arms for PvTimerConfig and Halt

Commit 2: KVM (in-kernel IRQ chip + PIT)
- … `Irqchip` and `Pit2` … `set_pit2()`, Halt port handling

Commit 3: MSHV (SynIC direct-mode timer)
- … `IA32_APIC_BASE` to prevent guest APIC disable

Commit 4: WHP (software timer thread)
- … (`WHvGet/SetVirtualProcessorInterruptControllerState2`)
- … `WHvRequestInterrupt` for periodic interrupt injection
- `set_sregs` APIC_BASE filtering to prevent accidental LAPIC disable

Commit 5: Tests
- `#[cfg_attr(feature = "hw-interrupts", ignore)]` on tests that conflict with hw-interrupts mode

Commit 6: CI
- `hw-interrupts` test step in `dep_build_test.yml`
- `hw-interrupts` recipe in Justfile
- … `test-like-ci` and `build-test-like-ci`

Test plan
- `cargo clippy -p hyperlight-host --no-default-features -F "kvm,hw-interrupts,init-paging"` passes
- `cargo clippy -p hyperlight-host --no-default-features -F "kvm,init-paging"` passes (without hw-interrupts)
- `cargo test -p hyperlight-host --no-default-features -F "kvm,hw-interrupts,init-paging" --lib`: 77 passed, 12 ignored
- `cargo test -p hyperlight-host --no-default-features -F "kvm,init-paging" --lib`: 83 passed, 5 ignored
- `cargo build -p hyperlight-host --no-default-features -F "kvm,hw-interrupts,executable_heap" --lib` succeeds
- `cargo test -p hyperlight-host --no-default-features -F "init-paging,hw-interrupts" -- hw_timer_interrupts --nocapture`
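
As an illustration of the WHP approach listed under Commit 4, the sketch below shows the general shape of a host-side software timer thread: a background thread sleeps for one period, calls an injection callback, and stops promptly when asked. It is a sketch under stated assumptions, not the PR's implementation: `SoftTimer` is a hypothetical name, and the `inject` closure stands in for whatever wraps `WHvRequestInterrupt` in the real backend.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical periodic timer that runs `inject` on a background thread.
pub struct SoftTimer {
    stop: Arc<AtomicBool>,
    handle: Option<JoinHandle<()>>,
}

impl SoftTimer {
    /// Start ticking every `period`. `inject` is a placeholder for the
    /// platform call that requests an interrupt on the vCPU.
    pub fn start<F>(period: Duration, mut inject: F) -> Self
    where
        F: FnMut() + Send + 'static,
    {
        let stop = Arc::new(AtomicBool::new(false));
        let stop_flag = Arc::clone(&stop);
        let handle = thread::spawn(move || {
            while !stop_flag.load(Ordering::Acquire) {
                thread::sleep(period);
                // Re-check after sleeping so no tick fires after stop().
                if stop_flag.load(Ordering::Acquire) {
                    break;
                }
                inject();
            }
        });
        Self { stop, handle: Some(handle) }
    }

    /// Ask the thread to stop and wait until it has exited.
    pub fn stop(&mut self) {
        self.stop.store(true, Ordering::Release);
        if let Some(handle) = self.handle.take() {
            let _ = handle.join();
        }
    }
}

impl Drop for SoftTimer {
    fn drop(&mut self) {
        // Make sure the thread never outlives the timer object.
        self.stop();
    }
}
```

Because the vCPU thread is blocked inside `WHvRunVirtualProcessor`, the tick has to come from a separate thread; joining in `stop()` and `Drop` ensures no injection can race with teardown of the partition.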