This improves the existing on-demand ROM paging and also breaks down ROM
buffers into 1MB blocks for platforms with memory fragmentation issues.
Fixed a potential RTC register issue on those platforms too.
Fixed page pinning in the interpreter (it would crash due to LRU
evictions).
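
As a rough illustration of why pinning matters with an LRU-managed page
cache, here is a minimal sketch. It is not gpsp's actual code: the slot
count and every name (page_slot, cache_get, cache_pin, ...) are
hypothetical. The point is only that a pinned page must be skipped when
choosing an eviction victim, otherwise the page being executed from can
disappear underneath the interpreter.

```c
#include <stdint.h>

#define NUM_SLOTS 4            /* hypothetical number of resident ROM pages */

typedef struct {
  int32_t  rom_page;           /* ROM page held in this slot, -1 if empty */
  uint32_t last_used;          /* LRU timestamp, larger means more recent */
  uint8_t  pinned;             /* nonzero while the page must stay resident */
} page_slot;

static page_slot slots[NUM_SLOTS];
static uint32_t lru_clock;

/* Marks every slot as empty and unpinned. */
void cache_reset(void) {
  for (int i = 0; i < NUM_SLOTS; i++) {
    slots[i].rom_page = -1;
    slots[i].last_used = 0;
    slots[i].pinned = 0;
  }
  lru_clock = 0;
}

/* Returns the slot holding rom_page, loading it over the least recently
 * used unpinned slot on a miss. Returns -1 if every slot is pinned. */
int cache_get(int32_t rom_page) {
  int victim = -1;
  for (int i = 0; i < NUM_SLOTS; i++) {
    if (slots[i].rom_page == rom_page) {   /* hit: refresh the timestamp */
      slots[i].last_used = ++lru_clock;
      return i;
    }
    if (slots[i].pinned)
      continue;                            /* pinned pages are never evicted */
    if (victim < 0 || slots[i].last_used < slots[victim].last_used)
      victim = i;
  }
  if (victim < 0)
    return -1;
  /* A real implementation would read the page from the ROM file here. */
  slots[victim].rom_page = rom_page;
  slots[victim].last_used = ++lru_clock;
  return victim;
}

void cache_pin(int slot)   { slots[slot].pinned = 1; }
void cache_unpin(int slot) { slots[slot].pinned = 0; }
```

In this sketch a caller would pin the slot it is executing from, fetch
any other pages it needs, and unpin once it moves on.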
This patch adds big-endian compatibility in gpsp (in general but only
for the interpreter). There's no performance hit for little-endian
platforms (should be a no-op) and only adds a small overhead in memory
accesses for big-endian platforms.
Most memory accesses are wrapped with a byteswap instruction, and I/O
register accesses are also rewired for proper access (using macros).
Video rendering has been fixed to also do byteswaps, but there are a
couple of games and rendering modes that still seem broken (they amount
to fewer than 20 games in my tests with 1K ROMs).
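
The byteswap wrapping described above could look roughly like the
following sketch. The macro and function names (eswap16, eswap32,
read_mem32, write_mem32) are illustrative assumptions and not
necessarily the ones used in gpsp. The detection relies on the
GCC/Clang __BYTE_ORDER__ predefined macro and falls back to assuming a
little-endian host when it is not available.

```c
#include <stdint.h>
#include <string.h>

/* Plain byte swaps; hypothetical helper names. */
static inline uint16_t swap16(uint16_t v) {
  return (uint16_t)((v >> 8) | (v << 8));
}
static inline uint32_t swap32(uint32_t v) {
  return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
         ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* GBA memory is little-endian, so swap only on big-endian hosts.
 * On little-endian hosts the macros expand to nothing (a no-op). */
#if defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
  #define eswap16(v) swap16(v)
  #define eswap32(v) swap32(v)
#else
  #define eswap16(v) (v)
  #define eswap32(v) (v)
#endif

/* Reads a 32-bit little-endian value from emulated memory. */
static inline uint32_t read_mem32(const uint8_t *mem, uint32_t offset) {
  uint32_t v;
  memcpy(&v, mem + offset, sizeof v);   /* avoids unaligned access issues */
  return eswap32(v);
}

/* Writes a 32-bit value to emulated memory in little-endian order. */
static inline void write_mem32(uint8_t *mem, uint32_t offset, uint32_t v) {
  v = eswap32(v);
  memcpy(mem + offset, &v, sizeof v);
}
```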
This also adds build rules and CI for NGC/WII/WIIU (untested).
Cleans up a ton of whitespace in cpu.c (like 100KB!) and improves
readability of some massive decode statements.
Added an optimization for PC-relative loads (pool loads) in ROM (since
it's read-only and cannot possibly change) that directly emits an
immediate load. This is way faster, especially on MIPS/x86. ARM could be
even faster if we rewrite the immediate load macros to also use a pool.
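
To make the pool-load idea concrete, here is a small sketch of the
translation-time check, not the actual dynarec code. The function names
are hypothetical, and for simplicity it only considers the first GBA
cartridge ROM mirror (0x08000000-0x09FFFFFF). If the literal lives in
ROM, the value is read once while translating and can be emitted as an
immediate; otherwise a normal load must be emitted.

```c
#include <stdint.h>

#define ROM_BASE 0x08000000u
#define ROM_END  0x0A000000u   /* simplification: first ROM mirror only */

/* Effective address of Thumb "LDR Rd, [PC, #imm8*4]": the PC reads as
 * the instruction address + 4, with bit 1 cleared. */
static inline uint32_t thumb_pool_address(uint32_t insn_addr, uint32_t imm8) {
  return ((insn_addr + 4) & ~3u) + imm8 * 4;
}

/* Returns 1 and stores the constant in *value when the load at addr can
 * be folded into an immediate, 0 when a regular load is required. */
int try_fold_pool_load(const uint8_t *rom, uint32_t rom_size,
                       uint32_t addr, uint32_t *value) {
  if (addr < ROM_BASE || addr >= ROM_END)
    return 0;                      /* not ROM: contents may change */
  uint32_t off = addr - ROM_BASE;
  if (rom_size < 4 || off > rom_size - 4)
    return 0;                      /* outside the loaded ROM image */
  /* Assemble the little-endian word independently of host endianness. */
  *value = (uint32_t)rom[off] |
           ((uint32_t)rom[off + 1] << 8) |
           ((uint32_t)rom[off + 2] << 16) |
           ((uint32_t)rom[off + 3] << 24);
  return 1;
}
```

A translator following this sketch would emit "load immediate *value"
when the function returns 1, and fall back to the generic memory read
handler otherwise.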
An address check was missing when reading aligned 32-bit (stm/ldm) data
from high memory areas (0xX0000000). This fixes SM4 EU, which for some
reason does some weird memory accesses (perhaps a bug?).
This saves a few cycles on MIPS and simplifies the core a bit.
Removed the write map; this only affects interpreter performance very
minimally. Rewired ARM and x86 handlers to support direct access to
I/EWRAM (and VRAM on ARM) to compensate. Overall performance is slightly
better, the code is cleaner, and this allows for further improvements in
the dynarecs.
Add options to select whether to boot from BIOS (default is no, as it is
now) and whether to use the official BIOS or the builtin one (default is
auto, which tries to use the official one but falls back to the builtin
if not found).
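
The selection logic for the BIOS option can be sketched as below. The
enum and function names are hypothetical, and what happens when the
official BIOS is explicitly requested but missing is an assumption of
this sketch (it reports a failure); only the "auto" fallback is taken
from the description above.

```c
#include <stdbool.h>

/* Hypothetical option values for "which BIOS to use". */
typedef enum { BIOS_AUTO, BIOS_OFFICIAL, BIOS_BUILTIN } bios_option;

/* Hypothetical outcome of the selection. */
typedef enum { USE_OFFICIAL, USE_BUILTIN, BIOS_MISSING } bios_result;

/* Picks the BIOS image given the user's option and whether an official
 * BIOS file was found on disk. */
bios_result select_bios(bios_option opt, bool official_found) {
  switch (opt) {
    case BIOS_BUILTIN:
      return USE_BUILTIN;
    case BIOS_OFFICIAL:
      /* Assumption: an explicit request does not silently fall back. */
      return official_found ? USE_OFFICIAL : BIOS_MISSING;
    case BIOS_AUTO:
    default:
      /* Auto: prefer the official BIOS, fall back to the builtin one. */
      return official_found ? USE_OFFICIAL : USE_BUILTIN;
  }
}
```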
This is not really necessary since it can share area with ROM.
Performance impact should be very minimal (haven't noticed it myself)
and could be compensated (even by a positive offset) if we bump the ROM
cache area size.
Tested with several dynarecs.
This removes libco and all the usages of it (+pthreads).
Rewired all dynarecs and the interpreter to return after every frame so
that libretro can process events. This required making the dynarecs
re-entrant. Dynarecs were updated to check for a new frame on every
update (IRQ, cycle exhaustion, I/O write, etc). The performance impact
of doing so should be minimal (and is definitely outweighed by the gains
from dropping libco). While at it, fixed small issues to get a bit more
perf: the ARM dynarec was not idling correctly, MIPS was using the stack
when not needed, etc.
Tested on PSP (mips), OGA (armv7), Linux (x86 and interpreter). Not
tested on Android though.
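
The "return after every frame" structure can be pictured with the
following simplified sketch. It is not the real core loop: emu_state,
run_block, update and run_frame are hypothetical names, and run_block
is a stand-in that pretends each translated block costs 100 cycles.
The only hardware fact used is that a GBA frame lasts 228 scanlines of
1232 cycles each. All state lives in emu_state, so execution can resume
where it left off on the next call (re-entrancy).

```c
#include <stdint.h>
#include <stdbool.h>

#define CYCLES_PER_FRAME (228u * 1232u)   /* 280896 cycles per GBA frame */

typedef struct {
  uint32_t cycles_into_frame;   /* cycles elapsed in the current frame */
  uint32_t frames;              /* completed frames */
  uint32_t blocks_run;          /* blocks executed so far */
} emu_state;

/* Stand-in for executing one block of guest code; returns its cost. */
static uint32_t run_block(emu_state *s) {
  s->blocks_run++;
  return 100;
}

/* Called on every update point (cycle exhaustion in this sketch).
 * Returns true when a frame boundary has been crossed. */
static bool update(emu_state *s, uint32_t cycles) {
  s->cycles_into_frame += cycles;
  if (s->cycles_into_frame >= CYCLES_PER_FRAME) {
    s->cycles_into_frame -= CYCLES_PER_FRAME;   /* carry the remainder */
    s->frames++;
    return true;
  }
  return false;
}

/* Runs until the end of the current frame, then returns to the caller
 * (the frontend), keeping the leftover cycles for the next call. */
void run_frame(emu_state *s) {
  for (;;) {
    uint32_t cycles = run_block(s);
    if (update(s, cycles))
      return;
  }
}
```

With this shape the frontend simply calls run_frame once per video
frame, with no coroutine or thread switch needed to hand control back.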