The mismatch causes one-off errors for instruction 0F000000 (swieq 0)
which results in all sorts of funny errors (branches to bad addresses).
Fixes a couple of games.
It turns out there are a couple of inaccuracies that do affect a couple
of games. Most of them are buggy games, but emulating these accesses
correctly helps them get past some of their own bugs.
This rewrites the way that CPU alerts work, making them a bitmap (since
multiple alerts can happen simultaneously, like SMC and IRQ). This
doesn't really fix many games but improves accuracy overall and improves
performance on some I/O writes (the ones without side effects).
The IRQ raising is now decoupled and explicitly called via a new
function (check_and_raise_interrupts) to avoid issues such as invalid
CPSR values (doesn't seem to bother most games!). There's more side
effects missing, so this just lays the ground for more fixes.
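
As a rough illustration of the bitmap idea described above, here is a
minimal sketch. All names and bit values are hypothetical and are not
taken from the actual gpsp sources; it only shows how several alerts can
be reported at once and how a side-effect-free I/O write can report
"no alert" so the caller can stay on the fast path.

```c
#include <stdint.h>

/* Hypothetical alert bits (illustrative values, not gpsp's real ones). */
enum {
  ALERT_NONE = 0,
  ALERT_HALT = 1u << 0,  /* CPU should stop executing            */
  ALERT_SMC  = 1u << 1,  /* self-modifying code: caches are stale */
  ALERT_IRQ  = 1u << 2   /* interrupt state must be re-checked    */
};

/* Hypothetical handler for an I/O register without side effects:
 * it stores the value and reports no alert at all. */
uint32_t write_plain_reg(uint16_t *reg, uint16_t value) {
  *reg = value;
  return ALERT_NONE;
}

/* Several alerts raised by the same access are simply OR-ed together. */
uint32_t merge_alerts(uint32_t a, uint32_t b) {
  return a | b;
}

/* The caller tests each bit independently. */
int needs_cache_flush(uint32_t alerts) {
  return (alerts & ALERT_SMC) != 0;
}

int needs_irq_check(uint32_t alerts) {
  return (alerts & ALERT_IRQ) != 0;
}
```

With a plain enum value instead of a bitmap, an access that caused both
an SMC and an IRQ alert would have to drop one of them.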
Whenever an interrupt is pending and interrupts are disabled (via
IME/IE), an IE/IME write that re-enables IRQs will fail to raise an IRQ.
This makes some games hang. Most games seem to use CPSR.IRQ to enable
and disable interrupts, so they are not affected. However some others use
IME/IE (or all of them), causing these deadlocks and some race
conditions.
This fixes a bunch of games that did not crash but would "hang" in some
interesting ways.
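
The re-check on IE/IME writes can be sketched roughly as follows. The
struct, field and function names here are hypothetical and only model
the condition (IME bit 0 set, IE & IF non-zero, CPSR I bit clear); the
real code paths in the emulator are more involved.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPSR_IRQ_DISABLE (1u << 7)  /* CPSR "I" bit */

/* Hypothetical, minimal model of the interrupt-related state. */
typedef struct {
  uint16_t ie;         /* REG_IE: per-source enable bits      */
  uint16_t iflag;      /* REG_IF: pending interrupt requests  */
  uint16_t ime;        /* REG_IME: master enable in bit 0     */
  uint32_t cpsr;       /* CPU status register                 */
  bool     irq_raised; /* set when the CPU should take an IRQ */
} irq_state_t;

/* An IRQ can be taken only if all three gates are open. */
bool irq_pending(const irq_state_t *s) {
  return (s->ime & 1) && (s->ie & s->iflag) &&
         !(s->cpsr & CPSR_IRQ_DISABLE);
}

/* Writes to IME and IE re-evaluate the condition, so an interrupt that
 * was already pending is raised as soon as it gets re-enabled. */
void write_ime(irq_state_t *s, uint16_t value) {
  s->ime = value;
  if (irq_pending(s))
    s->irq_raised = true;
}

void write_ie(irq_state_t *s, uint16_t value) {
  s->ie = value;
  if (irq_pending(s))
    s->irq_raised = true;
}
```

Without the re-check in the two write handlers, a request left pending
in IF would only be noticed the next time some other event triggered an
interrupt check, which matches the hangs described above.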
This fixes many games and brings it closer to what the other dynarecs
do. Without this patch there's register corruption if the memory write
triggers an IRQ (since raise_interrupt mangles the registers that have
not been saved).
At this point there's still an issue with CPSR saving but that affects
the other dynarecs as well.
In thumb mode, a store that triggers a DMA and its corresponding DMA
IRQ (since we emulate DMAs as being instantaneous, which is another can
of worms) overwrites the PC with the PC-next value (disregarding
the IRQ handler address). This causes quite a few bugs. I'd expect it
to cause way more bugs, but it doesn't...?
Anyway, this fixes a few games like Buffy, Woody, Medabots, Super
Duper Sumos and Power Rangers.
This fixes ROM swapping for x86/64, arm32 and arm64. On top of that it
improves speed by removing unnecessary slow paths on small ROMs for
arm32 and mips. If the ROM can fit in RAM, it will emit more efficient
code that assumes the ROM is fully loaded.
For low-memory Linux platforms it would be better to use some mmap'ed
ROM, that way the OS would transparently handle page swapping, which is
perhaps faster. Will investigate and follow up on this in a separate
commit.
VFS callbacks fail because they require V2 of the interface, which includes vfs_truncate; otherwise the code falls back to the libretro-common implementation. The current VFS wrapping code in libretro-common needs V2 because the vfs_truncate callback is set.
https://github.com/libretro/libretro-common/blob/master/streams/file_stream.c#L65
Now VFS callbacks work properly in frontends that support them. Otherwise, a hack of setting "cb->required_interface_version = 2" in the frontend works, but according to the specs only cores are meant to set the required version.
gpsp doesn't differentiate between USER and SYSTEM mode, most likely
because it is not that important for most games. This implements the modes
correctly and adds checks for privileged operations. Still some
bugs/hacks but it mostly fixes CPSR/SPSR reads/writes.
To implement PSR writes we are using more refined masks and force mode
bit number 4 to always be one. Reserved bits are forced to zero (this
needs to be validated on a real device).
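
A minimal sketch of the masking described above, under stated
assumptions: the function name, its parameters and the exact masks are
hypothetical, user mode is assumed to be allowed to change only the
condition flags, and the handling of the T bit is deliberately left
out.

```c
#include <stdint.h>

#define PSR_FLAGS_MASK 0xF0000000u  /* N, Z, C, V                  */
#define PSR_CTRL_MASK  0x000000FFu  /* I, F, T and the mode bits   */
#define PSR_MODE_BIT4  0x00000010u  /* always one in a valid mode  */

/* Hypothetical PSR write helper.
 *   old_psr     current register value
 *   value       value supplied by the MSR instruction
 *   field_mask  bits selected by the instruction's field specifier
 *   privileged  non-zero when not running in user mode */
uint32_t psr_write(uint32_t old_psr, uint32_t value,
                   uint32_t field_mask, int privileged) {
  uint32_t valid   = PSR_FLAGS_MASK | PSR_CTRL_MASK;
  uint32_t allowed = privileged ? valid : PSR_FLAGS_MASK;
  uint32_t mask    = field_mask & allowed;
  uint32_t result  = (old_psr & ~mask) | (value & mask);

  result &= valid;                /* reserved bits forced to zero */
  return result | PSR_MODE_BIT4;  /* mode bit 4 forced to one     */
}
```

For example, a user-mode write of 0xFFFFFFFF over a PSR of 0x10 only
changes the flags and yields 0xF0000010.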
A missing usermode check (present in MIPS and x86) causes user-mode
returns to attempt returning into another CPU mode, which causes a bunch
of issues, mainly going into an invalid CPU state and corrupting some
registers.
This fixes a couple of games only (Colin McRae Rally 2, TOCA World
Touring, Starsky & Hutch ...)
(This is similar to 908fb8 but for memory regions)
Removes the weird offset encoding in favour of a metadata structure,
similar to what there was before. However this structure overlaps with
the cache RAM itself and grows like a stack. This is to avoid wasting
memory since most games only use a few blocks.
Simplify the "dual" block lookup routines too, since they are only an
entrypoint to the other two real modes.
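
The stack-like metadata layout mentioned above could look roughly like
the sketch below. Everything here is an assumption made for
illustration (type names, field names, the cache size): generated code
grows upwards from the start of the buffer while one metadata entry per
block grows downwards from the end, so unused metadata slots cost
nothing.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_SIZE 4096u  /* illustrative size only */

/* Hypothetical per-block metadata. */
typedef struct {
  uint32_t pc;           /* guest address of the block         */
  uint32_t code_offset;  /* where its code starts in the cache */
} block_meta_t;

#define MAX_META (CACHE_SIZE / sizeof(block_meta_t))

/* Code bytes and metadata entries share the same memory. */
typedef struct {
  union {
    uint8_t      bytes[CACHE_SIZE];
    block_meta_t meta[MAX_META];  /* entry i lives at MAX_META-1-i */
  } mem;
  size_t code_used;   /* bytes of code emitted so far */
  size_t num_blocks;  /* metadata entries in use      */
} tcache_t;

void tcache_reset(tcache_t *c) {
  c->code_used = 0;
  c->num_blocks = 0;
}

/* Reserve room for a block's code plus one metadata entry. Returns
 * NULL when code and metadata would collide (caller must flush). */
uint8_t *tcache_alloc(tcache_t *c, uint32_t pc, size_t code_size) {
  size_t meta_bytes = (c->num_blocks + 1) * sizeof(block_meta_t);
  if (code_size > CACHE_SIZE ||
      c->code_used + code_size + meta_bytes > CACHE_SIZE)
    return NULL;

  block_meta_t *m = &c->mem.meta[MAX_META - 1 - c->num_blocks];
  m->pc = pc;
  m->code_offset = (uint32_t)c->code_used;

  uint8_t *code = &c->mem.bytes[c->code_used];
  c->code_used += code_size;
  c->num_blocks++;
  return code;
}
```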