This fixes a race condition that happens whenever the ROM cache is flushed but
the RAM one is not, causing any SWI calls (implemented as direct branches) to
jump to random instructions.
The fix could be to flush both caches at the same time (~expensive on
low mem platforms), use indirect jumps (a bit expensive) or emit the SWI
handler below the watermark to ensure it is never flushed. This is cheap
and effective, requires minimal changes.
This is around 8% perf improvement alone.
This also fixes many flag calculation/usage bugs (in corner cases) since
we use the x86 cpu native ALU flags (which are more or less the same as
ARM's). Passes all test ROMs for ALUs and no changes in game compat.
This mis-emits CMN instead of TEQ and TST in the reg-shift operand mode.
This is never used by gpsp directly but translating real tst opcodes,
hence it only affects games using such instruction.
This fixes video players that previously crashed, many games that had
graphical glitches in ARM mode (but not on other CPUs) usually in menus
or other dialogues. Also fixes games that either crashed or went blank
or similar issues. The extent of fixing is hard to determine but could
affect many games in different levels.
This uses BSON as savestate format, to allow external tools to parse it
(so that we can add proper test of the states). The BSON is not 100%
correct according to spec (no ordered keys) but can be parsed by most
libraries.
This fixes also a bug in the savestate palette color recalculation that
was wrongly overwritting the original palette (which could cause some
problems on some games).
Also fixes some potential issues by serializing some more stuff and
cleans up unused stuff.
Testing shows that states look good and there's only minor differences
in audio ticks, related to buffer sizes (since buffer flushes are
de-synced from video frames due to different frequency).
This is not needed at all, since the variables are not updated between
reload and end-of-frame (where we take our savestates). Added a reload
call during gba_load_state() to initialize it from the I/O regs.
Seems that, on IRQ, it is assumed that the PC will change (which happens
most of the time). However if the IRQ is masked it resumes execution.
On masked DMA IRQ the interpreter jumps to alert without incrementing
the instruction pointer (right after).
Removed the last bits of text relocations by moving all relevant RAMs to
the stub reachable area. This should be as fast or even faster than
previous code.
Seems that this could make VRAM overflow big time and overwrite IWRAM
most of the time (since it lives after that buffer). This causes a
variety of errors, some of them hidden if IWRAM not used.
Most games do not use VRAM mirror so it's not a big deal, but some do
have a off-by-one errors that trigger this.