This introduced a potential race condition between the start of a SWI
and the BIOS handling the exception by returning to system mode. During
this ~10 instruction window, having an IRQ that issues a SWI causes bad
behaviour that results in crashes or other weirdness.
Fixes a couple of games and potentially many weird and obscure bugs here
and there (hard to reproduce sometimes).
Removed the last bits of text relocations by moving all relevant RAMs to
the stub reachable area. This should be as fast or even faster than
previous code.
Seems that ABI mandates that we allocate space for arg0..4 even if we do
pass them as registers. For some reason write_io_register<> functions
write in that stack area (1 word) corrupting the s0 saved register.
This seems to be a new gcc behaviour?
This only needs some support to save/load state with 64 bit registers.
Since pointers remain 32 bit, no extra changes are needed in the
dynarec. Verified with qemu (qemu-mipsn32el) and miniretro.
Seems that using the __atribute__ magic for sections is not the best way
of doing this, since it injects some default atributtes that collide
with the user defined ones. Using assembly is far easier in this case.
Reworked definitions a bit to make it easier to import from assembly.
Also wrapped stuff around macros for easy and less verbose
implementation of the symbol prefix issue.
This saves a few cycles in MIPS and simplifies a bit the core.
Removed the write map, only affects interpreter performance very
minimally. Rewired ARM and x86 handlers to support direct access to
I/EWRAM (and VRAM on ARM) to compensate. Overall performance is slightly
better but code is cleaner and allows for further improvements in the
dynarecs.
Uses a different cache primitive and a differend madd(u) encoding.
Also added a flag for BGR vs RGB color output (since PSP is assuming to
be BGR for speed).
Aside from that the ABI required some special function calls for PIC.
This allows us to emit the handlers directly in a more efficient manner.
At the same time it allows for an easy fix to emit PIC code, which is
necessary for libretro. This also enables more platform specific
optimizations and variations, perhaps even run-time multiplatform
support.
This removes libco and all the usages of it (+pthreads).
Rewired all dynarecs and interpreter to return after every frame so that
libretro can process events. This required to make dynarec re-entrant.
Dynarecs were updated to check for new frame on every update (IRQ, cycle
exhaustion, I/O write, etc). The performance impact of doing so should
be minimal (and definitely outweight the libco gains). While at it,
fixed small issues to get a bit more perf: arm dynarec was not idling
correctly, mips was using stack when not needed, etc.
Tested on PSP (mips), OGA (armv7), Linux (x86 and interpreter). Not
tested on Android though.
Affects at least SM Adv 4 on PSP, which doesn't load at all.
I think the MIPS pipeline does not like invalidating the Icache and
using it immediately after (seems to read an old value sometimes?).
Rewired it to not do that and instead jump to the handler directly.