Removed the last bits of text relocations by moving all relevant RAMs to
the stub reachable area. This should be as fast or even faster than
previous code.
Seems that this could make VRAM overflow big time and overwrite IWRAM
most of the time (since it lives after that buffer). This causes a
variety of errors, some of them hidden if IWRAM not used.
Most games do not use VRAM mirror so it's not a big deal, but some do
have a off-by-one errors that trigger this.
This fixes unallowed BIOS accesses (outside from BIOS) which fixes some
games like Silent Scope but also Zelda (fixes rolling as reported by
neonloop!)
This improves the existing on-demand ROM paging and also breaks down ROM
buffers into 1MB blocks for platforms with memory fragmentation issues.
Fixed some potential RTC reg issue in said platforms too.
Fixed page pinning on interpreter (would crash due to LRU evictions).
This removes ram_block_ptrs and encodes the pointer directly in the
block tag. Saves ~256KB at no performance cost.
Drawback is that it limits the ram cache size to 512KB (we were using
768KB before). Should not be a problem since most games use less than
32KB of cache anyway.
Fixed ARM routines accordingly.
This is the format used by PS2.
This requires fixing the palette conversion routines (and palette writes
in the MIPS dynarec) but also adding support for 555 mode blending
(currently only 565 modes are supported, regardless of whether they are
RGB or BGR).
This fixes issue #133
The explanation is as follows. Most blocks end on an inconditional
jump/branch, but there's two cases where this doesn't happen:
translation gates and when we hit MAX_EXITS. These are very uncommon
cases and therefore more prone to hidden bugs.
When this happens, the last instruction emits a conditional jump (via
arm_conditional_block_header macro) which is patched by a later
instruction via generate_branch_patch_conditional. Typically the last
unconditional branch will trigger the patching condition (which is
aproximately condition != last_condition), but in these two cases it
might not happen, leaving an unpatched branch. This makes x86 and ARM
dynarecs crash in interesting ways (although it might not crash
depending on $stuff and make the bug even harder to track).
This patch adds big-endian compatibility in gpsp (in general but only
for the interpreter). There's no performance hit for little-endian
platforms (should be a no-op) and only add a small overhead in memory
accesses for big-endian platforms.
Most memory accesses are wrapped with a byteswap instruction and I/O reg
accesses are also rewired for proper access (using macros). Video
rendering has been fixed to also do byteswaps but there's a couple of
games and rendering modes that still seem broken (but they amount to
less than 20 games in my tests with 1K ROMs).
This also adds build rules and CI for NGC/WII/WIIU (untested)
This reduces code size more than 20% (which is 200-300KB!).
DMA handling accounts for less than 0.5% the average emulation runtime
which doesn't justify the crazy optimization level that the code has. In
fact it's more than likely that the new code runs faster due to less
I-cache trashing.
Seems that ABI mandates that we allocate space for arg0..4 even if we do
pass them as registers. For some reason write_io_register<> functions
write in that stack area (1 word) corrupting the s0 saved register.
This seems to be a new gcc behaviour?
This only needs some support to save/load state with 64 bit registers.
Since pointers remain 32 bit, no extra changes are needed in the
dynarec. Verified with qemu (qemu-mipsn32el) and miniretro.