Allow wait_irq to be called in 32bit code.

If wait_irq() is called from 32bit code, then jump to 16bit mode for
the wait.

Have wait_irq check for threads, and have it use yield if threads are
pending.  This ensures threads aren't delayed if anything calls wait_irq().

Use wait_irq() in 32bit mode during a failed boot.
diff --git a/Makefile b/Makefile
index d0b8881..72d711d 100644
--- a/Makefile
+++ b/Makefile
@@ -20,7 +20,7 @@
 SRC32FLAT=$(SRCBOTH) post.c shadow.c memmap.c coreboot.c boot.c \
       acpi.c smm.c mptable.c smbios.c pciinit.c optionroms.c mtrr.c \
       lzmadecode.c usb-hub.c paravirt.c
-SRC32SEG=util.c output.c pci.c pcibios.c apm.c
+SRC32SEG=util.c output.c pci.c pcibios.c apm.c stacks.c
 cc-option = $(shell if test -z "`$(1) $(2) -S -o /dev/null -xc \
               /dev/null 2>&1`"; then echo "$(2)"; else echo "$(3)"; fi ;)