Commit Graph

62 Commits

Author SHA1 Message Date
David Guillen Fandos f4e65d15c0 [arm/arm64] Fix OAM/VRAM byte writes
Forgot to add byte-doubling codepath for byte writes. This is correctly
implemented in x86 and MIPS.
2023-04-26 00:29:38 +02:00
David Guillen Fandos eb50c15b1c Remove CHANGED_PC_STATUS, simplify update flow 2023-04-24 20:24:03 +02:00
David Guillen Fandos 11d87b89df Rewrite I/O side effects write and IRQ triggers
This rewrites the way that CPU alerts work, making them a bitmap (since
multiple alerts can happen simultaneously, like SMC and IRQ). This
doesn't really fix many games but improves accuracy overall and improves
performance on some I/O writes (the ones without side effects).
The IRQ raising is now decoupled and explicitely called via a new
function (check_and_raise_interrupts) to avoid issues such as invalid
CPSR values (doesn't seem to bother most games!). There's more side
effects missing, so this just lays the ground for more fixes.
2023-04-14 01:41:55 +02:00
David Guillen Fandos 0e17e13c90 [arm] Fix I/O write epilogue
This fixes many games and brings it closer to what the other dynarecs
do. Without this patch there's register corruption if the memory write
triggers an IRQ (since raise_interrupt mangles the registers that have
not been saved).

At this point there's still an issue with CPSR saving but that affects
aso the other dynarecs.
2023-04-06 10:10:16 +02:00
David Guillen Fandos 541adc9e1c Fix ROM swapping capabilities
This fixes ROM swapping for x86/64, arm32 and arm64. On top of that it
improves speed by removing unnecessary slow paths on small ROMs for
arm32 and mips. If the ROM can fit in RAM, it will emit more efficient
code that assumes the ROM is fully loaded.

For low-memory Linux platforms it would be better to use some mmap'ed
ROM, that way the OS would transparently handle page swapping, which is
perhaps faster. Will investigate and follow up on this in a separate
commit.
2023-03-03 21:05:12 +01:00
David Guillen Fandos 4f3c9a5e58 [all] Fix CPSR and CPU modes
gpsp doesn't differentiate between USER and SYSTEM mode, most likely
cause it is not that important for most games. This implements the modes
correctly and adds checks for privileged operations. Still some
bugs/hacks but it mostly fixes CPSR/SPSR reads/writes.

To implement PSR writes we are using a more refined masks and force mode
bit num. 4 to always be one. Reserved bits are forced to zero (this
needs to be validated on a real device).
2023-01-11 21:26:32 +01:00
David Guillen Fandos 90170e3389 [arm] Fix usermode "movs pc" causing invalid CPU state
A missing usermode check (present in MIPS and x86) causes user-mode
returns to attempt returning into another CPU mode, which causes a bunch
of issues, mainly going into an invalid CPU state and corrupting some
registers.

This fixes a couple of games only (Colin McRae Rally 2, TOCA World
Touring, Starsky & Hutch ...)
2023-01-05 21:29:20 +01:00
David Guillen Fandos b552d5eb7e Improve open bus reads on ARM/MIPS 2023-01-05 21:29:20 +01:00
David Guillen Fandos 3a6ca8d941 Better cycle accounting, taking remainders partially into account 2021-12-21 19:59:33 +01:00
David Guillen Fandos a2aa78733d Further simplify stubs, remove unnecessary masking 2021-11-14 22:40:03 +01:00
David Guillen Fandos 6c4ffc4db2 [arm] Improve external stores and make them faster
While at it, speed up palette writes too.
2021-11-14 13:19:20 +01:00
David Guillen Fandos 3a7fedb8fb Simplify MMAP machinery for Win/Lin/Mac/Android
This gets rid of the bloated memmap_win32.c in favour of a much simpler
wrapper. This will be needed in the future since the wrapper does not
support MAP_FIXED maps (necessary for some platforms)
2021-11-05 18:23:05 +01:00
David Guillen Fandos c39f5391f0 [ARM] Start using mmap'ed JIT caches for most builds
Will revisit remaining platforms and try to clean them too.
This should fix Android once and for all.
2021-10-29 22:51:39 +02:00
David Guillen Fandos 5c1467cb63 [ARM] Remove cross calls between cache and text (far calls)
This is to allow cache to be mapped far from the regular .text section
2021-10-29 22:34:29 +02:00
David Guillen Fandos f65c3939b5 [ARM] Rewrite HLE emulation for div, make it faster and simpler.
Moves the handlers to the cache, removes C usage.
2021-10-24 18:00:06 +02:00
David Guillen Fandos b65df123f8 [ARM] Add support for ARMv5 2021-10-23 23:59:51 +02:00
David Guillen Fandos d558fb4fc4 [ARM] Rework memory handlers for speed amb simplicity
This removes one branch and emits the region selection code directly in
the JIT cache. Trading memory for speed (although it's not a big
improvement).

This is a step towards enabling MMAP caches in ARM (due to the 32MB
offset limitation in branches).
2021-10-23 23:33:15 +02:00
David Guillen Fandos 4bebb6135d Inline spsr operations 2021-10-23 12:02:46 +02:00
David Guillen Fandos cdb61227bc Fix arm32 spsr writes: only write selected bits
Seems no game is affected, even though the current implementration is
broken. Most games seem to not use mask at all.
2021-10-23 09:06:24 +02:00
David Guillen Fandos 3cab8596b8 Fix SWI handling (disable IRQs)
This introduced a potential race condition between the start of a SWI
and the BIOS handling the exception by returning to system mode. During
this ~10 instruction window, having an IRQ that issues a SWI causes bad
behaviour that results in crashes or other weirdness.
Fixes a couple of games and potentially many weird and obscure bugs here
and there (hard to reproduce sometimes).
2021-09-17 22:22:01 +02:00
David Guillen Fandos f51ed9de13 Improve SWI codepaths and implement div&divarm natively 2021-09-03 01:01:37 +02:00
David Guillen Fandos 55c6a69ccd Move flag regs to unserialized area
This is only used in x86 (mips and arm use native regs and never spill)
2021-08-28 17:50:25 +02:00
David Guillen Fandos 057b80f8cc Fix a bug with BLH (half BL) on ARM
This fixes issues on games from Camelot (GS2 and Mario Golf)
2021-08-15 22:19:40 +02:00
David Guillen Fandos 14bc6e3554 Remove unnecessary check in update stub (arm) 2021-08-15 21:52:26 +02:00
David Guillen Fandos 5be5015338 Rearrange register layout and exclude useless regs from savestat
This changes the savestate format once again.
2021-08-15 21:07:20 +02:00
David Guillen Fandos ce14e2585b Fix and reenable Android arm 32 bit builds
Removed the last bits of text relocations by moving all relevant RAMs to
the stub reachable area. This should be as fast or even faster than
previous code.
2021-07-31 17:45:46 +02:00
David Guillen Fandos b3abefa7d9 Avoid using relocations in arm code 2021-07-31 00:20:18 +02:00
David Guillen Fandos ab7d9bb161 Move membuffers close to dynarec area to fix x86 relocs
This essentially makes it easier to get a relocation-free text area for
x86 so that Android loaders are happy.
2021-07-28 19:12:43 +02:00
David Guillen Fandos 72a4a91fda Speed up arm stores 2021-07-15 21:27:38 +02:00
David Guillen Fandos ac3e75a107 Reimplement arm load stubs and fix BIOS handler
This fixes unallowed BIOS accesses (outside from BIOS) which fixes some
games like Silent Scope but also Zelda (fixes rolling as reported by
neonloop!)
2021-07-15 00:40:52 +02:00
David Guillen Fandos 221b8ff115 Partially revert 71ebc49b
Just move the complex lookup to C for simplicity since there's not a lot
of gain to have. This makes it easier for devices with weird jit caches.
2021-07-14 01:59:46 +02:00
David Guillen Fandos 3144d9e277 Rework ram block ptrs to remove second indirection table.
This removes ram_block_ptrs and encodes the pointer directly in the
block tag. Saves ~256KB at no performance cost.
Drawback is that it limits the ram cache size to 512KB (we were using
768KB before). Should not be a problem since most games use less than
32KB of cache anyway.

Fixed ARM routines accordingly.
2021-07-08 21:29:48 +02:00
David Guillen Fandos 2877886ff1 Fix ARM dynarec unaligned 32 bit loads
This might make a handful games slightly slower (but on the upper side
they work now instead of crashing or restarting).
Also while at it, fix some minor stuff in arm stubs for speed.
2021-05-17 01:16:56 +02:00
David Guillen Fandos 4fd456e158 Adding Code Breaker cheat support
This works on both interpreter and dynarec.
Tested in MIPS, ARM and x86, still needs some more testing, some edge
cases can be buggy.
2021-05-05 21:15:27 +02:00
David Guillen Fandos d83f8fbd25 Fix Vita port and likely some Linux/Android hidden issues
Using an invalid SP makes Vita crash (for an unkown reason) and makes
things like C signal handlers crash (luckily Retroarch doesn't use
them). It is also a violation of the ABI and not a great idea.
Recycled some little used registers to free SP. Perf should be roughly
the same.
2021-04-27 18:39:46 +02:00
David Guillen Fandos 5b5a4db6c2 Add instruction tracing, for testing purposes 2021-04-03 00:37:42 +02:00
David Guillen Fandos 8c14ac9619 Add function decorators for easier debugging / profiling 2021-04-02 02:10:00 +02:00
David Guillen Fandos 71ebc49b59 Improve indirect jumps in ARM
Handle already translated blocks in the ARM asm to speed up indirect
branches (affect some games more than others)
2021-03-30 21:06:52 +02:00
David Guillen Fandos 336b14a876 Improve ARM store handlers 2021-03-30 01:21:48 +02:00
David Guillen Fandos 452ba76ba8 Fix 16 bit RAM stores (VRAM and OAM) in ARM 2021-03-26 23:13:26 +01:00
David Guillen Fandos d284c868e9 Improve ARM store accesses 2021-03-26 23:13:26 +01:00
David Guillen Fandos 7ea6c5e247 Move OAM RAM to stubs also
Makes accesses more efficient for MIPS. Make accesses also fast for palette
reads.
2021-03-26 23:13:26 +01:00
David Guillen Fandos a494a3f00e Move OAM update flag to a register
Fix a small bug in MIPS dynarec that affects non -G0 targets
2021-03-26 23:13:26 +01:00
David Guillen Fandos ff510e7f7a Move caches to stub files to get around gcc 10
Seems that using the __atribute__ magic for sections is not the best way
of doing this, since it injects some default atributtes that collide
with the user defined ones. Using assembly is far easier in this case.

Reworked definitions a bit to make it easier to import from assembly.
Also wrapped stuff around macros for easy and less verbose
implementation of the symbol prefix issue.
2021-03-23 20:02:44 +01:00
David Guillen Fandos 11ec213c99 Make ewram memory lineal
This saves a few cycles in MIPS and simplifies a bit the core.
Removed the write map, only affects interpreter performance very
minimally. Rewired ARM and x86 handlers to support direct access to
I/EWRAM (and VRAM on ARM) to compensate. Overall performance is slightly
better but code is cleaner and allows for further improvements in the
dynarecs.
2021-03-23 19:09:56 +01:00
David Guillen Fandos 9de4220376 asm fixes for clang 2021-03-18 01:20:14 +01:00
David Guillen Fandos fb7ca09b01 Remove BIOS reserved translation area
This is not really necessary since it can share area with ROM.
Performance impact should be very minimal (haven't noticed it myself)
and could be compensated (even by a positive offset) if we bump the ROM
cache area size.
Tested with several dynarecs.
2021-03-17 18:33:02 +01:00
David Guillen Fandos 46cad2958a Move a few more registers to context
This gets rid of some more absolute addrs in the MIPS dynarec.
Tested on several platforms, we should be good.
2021-03-16 01:02:10 +01:00
David Guillen Fandos c86b9064df Move palettes around to simplify MIPS dynarec
Will move also OAM structures to gain a few cycles per load/store.
Loads can also be optimized for an extra instruction per access.
2021-03-15 02:25:02 +01:00
David Guillen Fandos 56dc6ecb70 Remove libco
This removes libco and all the usages of it (+pthreads).
Rewired all dynarecs and interpreter to return after every frame so that
libretro can process events. This required to make dynarec re-entrant.

Dynarecs were updated to check for new frame on every update (IRQ, cycle
exhaustion, I/O write, etc). The performance impact of doing so should
be minimal (and definitely outweight the libco gains). While at it,
fixed small issues to get a bit more perf: arm dynarec was not idling
correctly, mips was using stack when not needed, etc.

Tested on PSP (mips), OGA (armv7), Linux (x86 and interpreter). Not
tested on Android though.
2021-03-08 18:44:03 +01:00