Commit Graph

84 Commits

Author SHA1 Message Date
David Guillen Fandos 34eb7a3bf3 Fix reset() issue with dynarec flushing
On a reset bios_swi_entrypoint can end up pointing to code over the
watermark, due to block_lookup_address_arm looking up the function
instead of translating it.
Fix it by making flush and init different functions (albeit similar).

Tested by running and resetting games automatically; before this fix
~10% of games crashed.
2023-06-09 20:21:35 +02:00
David Guillen Fandos 84c347edad [interp] Improve interpreter timings and honor WAITCNT
This fixes a few games and makes the interpreter faster (since it
doesn't run an overclocked CPU anymore).
2023-06-07 19:40:27 +02:00
David Guillen Fandos 2d28451f6c Fix SWI branch mismatch
The mismatch causes one-off errors for instruction 0F000000 (swieq 0)
which results in all sorts of funny errors (branches to bad addresses).
Fixes a couple of games.
2023-04-20 21:08:49 +02:00
David Guillen Fandos 9ef1a4d7b8 Fix tracing conditional ARM insts 2023-04-14 01:40:51 +02:00
David Guillen Fandos 4f3c9a5e58 [all] Fix CPSR and CPU modes
gpsp doesn't differentiate between USER and SYSTEM mode, most likely
because it is not that important for most games. This implements the modes
correctly and adds checks for privileged operations. Still some
bugs/hacks but it mostly fixes CPSR/SPSR reads/writes.

To implement PSR writes we are using more refined masks and force mode
bit num. 4 to always be one. Reserved bits are forced to zero (this
needs to be validated on a real device).
2023-01-11 21:26:32 +01:00
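
The PSR write rules described in 4f3c9a5e58 can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the commit: every name and mask value here is hypothetical, the T bit is assumed to be left untouched by PSR writes, and control bits are assumed to be writable only outside USER mode.

```c
#include <assert.h>
#include <stdint.h>

/* All names below are hypothetical; they are not gpsp identifiers. */
#define PSR_FLAGS_MASK   0xF0000000u  /* N, Z, C, V condition flags */
#define PSR_CONTROL_MASK 0x000000DFu  /* I, F and mode bits; T (bit 5) excluded (assumption) */
#define PSR_MODE_MASK    0x0000001Fu  /* mode bits M4..M0 */
#define PSR_MODE_BIT4    0x00000010u  /* mode bit number 4, forced to one */
#define PSR_VALID_MASK   0xF00000FFu  /* everything else is treated as reserved */
#define MODE_USER        0x10u

/* Returns the PSR value after an MSR-style write of `value` over `old_psr`.
 * write_flags / write_control select which fields the instruction targets. */
uint32_t psr_write(uint32_t old_psr, uint32_t value,
                   int write_flags, int write_control)
{
    uint32_t mask = 0;
    uint32_t new_psr;

    if (write_flags)
        mask |= PSR_FLAGS_MASK;
    /* Privilege check: USER mode may only change the condition flags. */
    if (write_control && (old_psr & PSR_MODE_MASK) != MODE_USER)
        mask |= PSR_CONTROL_MASK;

    new_psr = (old_psr & ~mask) | (value & mask);
    new_psr |= PSR_MODE_BIT4;    /* mode bit 4 always reads as one */
    new_psr &= PSR_VALID_MASK;   /* reserved bits forced to zero */

    assert((new_psr & PSR_MODE_BIT4) != 0);
    return new_psr;
}
```

For instance, a control write of mode value 0x03 from SYSTEM mode ends up as 0x13 in this sketch, while the same write from USER mode leaves the mode bits alone.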
twinaphex e554360dd3 Fix for loop initial declaration errors 2022-01-19 19:00:57 +01:00
David Guillen Fandos b6ddec8fa0 Simplify lookup/translate logic. 2022-01-05 16:32:42 +01:00
David Guillen Fandos f597836abc Implement dual mode (arm/thumb) for RAM positions
(This is similar to 908fb8 but for memory regions)

Removes the weird offset encoding in favour of a metadata structure,
similar to what there was before. However this structure overlaps with
the cache ram itself and grows like a stack. This is to avoid wasting
memory since most games only use a few blocks.
Simplify the "dual" block lookup routines too, since they are only an
entrypoint to the other two real modes.
2022-01-04 18:59:07 +01:00
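
The layout described in f597836abc (metadata that overlaps the cache and grows like a stack) might look something like the sketch below. It is a simplified model, not the actual gpsp structures: the arena size, the block_meta_t fields and all function names are assumptions made for illustration. Code grows upward from the start of the arena, metadata grows downward from the end, and an allocation fails once the two would collide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARENA_SIZE 4096u   /* hypothetical size, chosen for the example */

/* Hypothetical per-block metadata: one guest address, two translations. */
typedef struct {
    uint32_t pc;            /* guest address of the block */
    uint32_t arm_offset;    /* arena offset of the ARM translation */
    uint32_t thumb_offset;  /* arena offset of the Thumb translation */
} block_meta_t;

typedef struct {
    uint8_t mem[ARENA_SIZE];
    size_t  code_top;       /* first free byte for code, grows upward */
    size_t  meta_bottom;    /* lowest metadata byte, grows downward */
} arena_t;

void arena_reset(arena_t *a)
{
    a->code_top = 0;
    a->meta_bottom = ARENA_SIZE;
}

/* Reserves `size` bytes for translated code; returns its offset or -1. */
long arena_alloc_code(arena_t *a, size_t size)
{
    size_t off;
    assert(a->code_top <= a->meta_bottom);
    if (size > a->meta_bottom - a->code_top)
        return -1;          /* would run into the metadata stack */
    off = a->code_top;
    a->code_top += size;
    return (long)off;
}

/* Pushes one metadata entry at the far end; returns its offset or -1. */
long arena_push_meta(arena_t *a, const block_meta_t *m)
{
    assert(a->code_top <= a->meta_bottom);
    if (sizeof(*m) > a->meta_bottom - a->code_top)
        return -1;          /* would run into the translated code */
    a->meta_bottom -= sizeof(*m);
    memcpy(&a->mem[a->meta_bottom], m, sizeof(*m));
    return (long)a->meta_bottom;
}
```

With this arrangement a game that only touches a few blocks spends almost the whole arena on code, which is the memory saving the commit message refers to.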
David Guillen Fandos 8036ad5b50 Fix ram flush again! Wrapping mirrors are hard to track
Track PC on every iteration, round up a couple of instructions and align
the base address for speed.
2022-01-04 00:51:48 +01:00
David Guillen Fandos e71290e0ad Fix out of bounds RAM flush.
This can happen whenever the PC wraps around a mirror.
2022-01-03 01:23:51 +01:00
David Guillen Fandos 908fb831e0 Implement dynarec mode check for ROM code
This allows the emulator to recompile the same block as ARM and Thumb.
Some games do execute some code both as ARM and Thumb if you can believe
it! Also the dynarec can be a bit aggressive at pre-compiling some
blocks and can misunderstand branches in the wrong mode.

This fixes NBA Jam 2002 for all dynarec backends.
2022-01-03 01:18:47 +01:00
David Guillen Fandos daac3b7d91 Penalize HLE division, it's just too fast :) 2021-12-21 20:06:24 +01:00
David Guillen Fandos d0fd474777 [arm] Fix multiply (muls) and 64 bit mul where rlo==rhi
Seems rhi has precedence over rlo
2021-12-17 10:46:45 +01:00
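
The rlo==rhi behaviour noted in d0fd474777 can be illustrated with a small sketch. It assumes a plain 16-entry register array and a hypothetical helper name; the only point is the write order, where the high word is stored last so that it takes precedence when both destinations are the same register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: unsigned 64 bit multiply into two destination registers. */
void umull_sketch(uint32_t regs[16], unsigned rdlo, unsigned rdhi,
                  uint32_t a, uint32_t b)
{
    uint64_t result = (uint64_t)a * (uint64_t)b;

    assert(rdlo < 16 && rdhi < 16);
    regs[rdlo] = (uint32_t)result;          /* low word first */
    regs[rdhi] = (uint32_t)(result >> 32);  /* high word last: wins if rdlo == rdhi */
}
```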
David Guillen Fandos c6601d8932 [aarch64] Fix cache flushing out of bounds
Seems like some platforms don't like this very much :|
2021-12-13 18:54:21 +01:00
David Guillen Fandos fae9c7074b Fix dynarec flag optimization in Thumb mode
Usually blocks end with a branch, which also consumes all flags, but in
case the block is aborted early (or any other reason to not finish the
block on a branch), it will result in only a subset of flags being
generated, which causes problems in a couple of games.

This performs an out of bounds flag read, which is incorrect
2021-12-13 18:42:06 +01:00
David Guillen Fandos bcd3d1ca29 [aarch64] Adding new aarch64 dynarec!
This is based on the MIPS dynarec (more or less) with some ARM
borrowings. Seems to be quite fast (in my testing, mixed results:
faster than ARM on A1 but not a lot faster than the interpreter on
Android Snapdragon 845) but still some optimizations are missing at the
moment.

Seems to pass my testing suite and compatibility wise is very similar to
arm.
2021-12-12 13:18:13 +01:00
David Guillen Fandos e3d5ca8419 [x86/x64] Add support for x86-64 and improve 32 bit mode too.
This adds support for x86-64 dynarec both on Windows and Linux. Since
they have different requirements there's some macro magic in the stubs
file.

This also fixes x86 support in some cases: stack alignment requirements
were violated all over. This allows the usage of clang as a compiler
(which has a tendency to use SSE instructions more often than gcc does).

To support this I also reworked the mmap/VirtualAlloc magic to make sure
JIT arena stays close to .text.

Fixed some other minor issues and removed some unnecessary JIT code here
and there. clang tends to make some (wrong?) assumptions about global
symbols alignment.
2021-11-06 12:17:50 +01:00
David Guillen Fandos 3a7fedb8fb Simplify MMAP machinery for Win/Lin/Mac/Android
This gets rid of the bloated memmap_win32.c in favour of a much simpler
wrapper. This will be needed in the future since the wrapper does not
support MAP_FIXED maps (necessary for some platforms)
2021-11-05 18:23:05 +01:00
David Guillen Fandos 15cc02e03c [MIPS] Move and restructure mips backend 2021-10-30 22:59:33 +02:00
David Guillen Fandos 3f012afcda Make ROM hash table mechanism 64 bit compatible. 2021-10-30 22:54:51 +02:00
David Guillen Fandos 6c195cdcaa Add libretro-common and VFS functions for read/write
Remove small unused stuff while at it.
2021-09-30 18:31:11 +02:00
David Guillen Fandos 9abb3ef934 Add printf flush to better capture crashes 2021-09-19 21:50:38 +02:00
David Guillen Fandos b7472eedf1 Add EOB translation gate to fix blocks that are too big
This fixes a couple of games only (AFAICS)
2021-09-19 21:35:13 +02:00
David Guillen Fandos 33f1e25099 Emit BIOS SWI entrypoint to ROM arena
This fixes a race condition that happens whenever the ROM cache is flushed but
the RAM one is not, causing any SWI calls (implemented as direct branches) to
jump to random instructions.
The fix could be to flush both caches at the same time (~expensive on
low mem platforms), use indirect jumps (a bit expensive) or emit the SWI
handler below the watermark to ensure it is never flushed. This is cheap
and effective, requires minimal changes.
2021-09-10 00:30:55 +02:00
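
The watermark idea from 33f1e25099 can be sketched as follows. This is a toy model with hypothetical names, tracking offsets only: anything emitted below the watermark (such as the SWI handler) survives a flush, because flushing rewinds the write position to the watermark rather than to the start of the cache.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping for one translation cache. */
typedef struct {
    size_t size;       /* total cache size in bytes */
    size_t watermark;  /* bytes below this offset are never flushed */
    size_t top;        /* next free offset */
} code_cache_t;

void cache_init(code_cache_t *c, size_t size)
{
    c->size = size;
    c->watermark = 0;
    c->top = 0;
}

/* Emits persistent code (e.g. a SWI entrypoint). Only allowed before any
 * regular block has been emitted. Returns the offset or -1. */
long cache_emit_persistent(code_cache_t *c, size_t n)
{
    size_t off;
    assert(c->watermark <= c->top && c->top <= c->size);
    if (c->top != c->watermark || n > c->size - c->top)
        return -1;
    off = c->top;
    c->top += n;
    c->watermark = c->top;   /* raise the watermark over the new code */
    return (long)off;
}

/* Emits a regular translated block. Returns -1 when a flush is needed. */
long cache_emit_block(code_cache_t *c, size_t n)
{
    size_t off;
    assert(c->watermark <= c->top && c->top <= c->size);
    if (n > c->size - c->top)
        return -1;
    off = c->top;
    c->top += n;
    return (long)off;
}

/* Discards regular blocks but keeps everything below the watermark. */
void cache_flush(code_cache_t *c)
{
    c->top = c->watermark;
}
```

Direct branches into the persistent region stay valid across flushes, which is why this is cheaper than flushing both caches together or switching to indirect jumps.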
David Guillen Fandos b431a8a4b6 Merge stub arena into ROM cache for simplicity. 2021-09-09 19:06:15 +02:00
David Guillen Fandos f51ed9de13 Improve SWI codepaths and implement div&divarm natively 2021-09-03 01:01:37 +02:00
David Guillen Fandos d649fe96cb Fix high/low ram watermark tracking
Fixes negative sized memset calls and some wrap around bugs. Fixes at
least a couple of games.
2021-08-28 16:49:41 +02:00
David Guillen Fandos 8e50b168cb Fix OOO access on last instruction.
Cycle counting is a bit broken, needs some rework.
2021-08-26 13:40:48 +02:00
David Guillen Fandos 1e976fb312 Remove unused stuff and fix const variables
Trying to figure out what needs to be part of a savestate :)
2021-08-24 10:57:30 +02:00
David Guillen Fandos 60155e0b81 Add preliminary support for PS2 devices 2021-07-22 18:30:45 +02:00
David Guillen Fandos a8d99d993f Fix potential MIPS issue on cache alignment 2021-07-19 18:55:42 +02:00
David Guillen Fandos 3144d9e277 Rework ram block ptrs to remove second indirection table.
This removes ram_block_ptrs and encodes the pointer directly in the
block tag. Saves ~256KB at no performance cost.
Drawback is that it limits the ram cache size to 512KB (we were using
768KB before). Should not be a problem since most games use less than
32KB of cache anyway.

Fixed ARM routines accordingly.
2021-07-08 21:29:48 +02:00
David Guillen Fandos 0ca87a4807 Fix conditional ARM instructions at the end of a translation block
This fixes issue #133

The explanation is as follows. Most blocks end on an unconditional
jump/branch, but there's two cases where this doesn't happen:
translation gates and when we hit MAX_EXITS. These are very uncommon
cases and therefore more prone to hidden bugs.

When this happens, the last instruction emits a conditional jump (via
arm_conditional_block_header macro) which is patched by a later
instruction via generate_branch_patch_conditional. Typically the last
unconditional branch will trigger the patching condition (which is
approximately condition != last_condition), but in these two cases it
might not happen, leaving an unpatched branch. This makes x86 and ARM
dynarecs crash in interesting ways (although it might not crash
depending on $stuff and make the bug even harder to track).
2021-07-05 18:19:19 +02:00
David Guillen Fandos aafde6de7b Add ROM mirroring and fix mult. cycle count
This should correct some minor issues in some games.
2021-05-17 01:16:56 +02:00
David Guillen Fandos 37430f22c5 Small optimization (~2-4%) and whitespace cleanup!
Cleans up a ton of whitespace in cpu.c (like 100KB!) and improves
readability of some massive decode statements.

Added an optimization for PC-relative loads (pool load) in ROM (since
it's read only and cannot possibly change) that directly emits an
immediate load. This is way faster, especially on MIPS/x86; ARM can be
even faster if we rewrite the immediate load macros to also use a pool.
2021-05-07 20:41:54 +02:00
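
The pool load optimization from 37430f22c5 boils down to a translation-time decision, sketched below. The function and type names are hypothetical and the real dynarec emits machine code rather than returning a plan; the sketch assumes the effective load address has already been computed, that cartridge ROM is mapped at 0x08000000-0x0DFFFFFF with 32MB mirrors, and that only word-aligned loads are folded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ROM_REGION_START 0x08000000u
#define ROM_REGION_END   0x0E000000u  /* exclusive */
#define ROM_OFFSET_MASK  0x01FFFFFFu  /* 32MB mirrors */

/* Hypothetical result of the decision: what the translator should emit. */
typedef enum { LOAD_AS_IMMEDIATE, LOAD_FROM_MEMORY } load_kind_t;

typedef struct {
    load_kind_t kind;
    uint32_t    operand;  /* constant value, or the address to load from */
} load_plan_t;

/* Decides how to translate a 32 bit PC-relative load from `address`. */
load_plan_t plan_pool_load(uint32_t address, const uint8_t *rom, size_t rom_size)
{
    load_plan_t plan = { LOAD_FROM_MEMORY, address };

    assert(rom != NULL);
    if (address >= ROM_REGION_START && address < ROM_REGION_END &&
        (address & 3u) == 0) {
        size_t off = address & ROM_OFFSET_MASK;
        if (rom_size >= 4 && off <= rom_size - 4) {
            /* ROM is read only, so the value is known at translation time. */
            plan.kind = LOAD_AS_IMMEDIATE;
            plan.operand = (uint32_t)rom[off]
                         | ((uint32_t)rom[off + 1] << 8)
                         | ((uint32_t)rom[off + 2] << 16)
                         | ((uint32_t)rom[off + 3] << 24);
        }
    }
    return plan;
}
```

Loads from RAM keep the normal memory access path, since their contents can change after the block has been translated.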
David Guillen Fandos 4fd456e158 Adding Code Breaker cheat support
This works on both interpreter and dynarec.
Tested in MIPS, ARM and x86, still needs some more testing, some edge
cases can be buggy.
2021-05-05 21:15:27 +02:00
David Guillen Fandos 5b5a4db6c2 Add instruction tracing, for testing purposes 2021-04-03 00:37:42 +02:00
David Guillen Fandos 71ebc49b59 Improve indirect jumps in ARM
Handle already translated blocks in the ARM asm to speed up indirect
branches (affect some games more than others)
2021-03-30 21:06:52 +02:00
David Guillen Fandos ff510e7f7a Move caches to stub files to get around gcc 10
Seems that using the __attribute__ magic for sections is not the best way
of doing this, since it injects some default attributes that collide
with the user defined ones. Using assembly is far easier in this case.

Reworked definitions a bit to make it easier to import from assembly.
Also wrapped stuff around macros for easy and less verbose
implementation of the symbol prefix issue.
2021-03-23 20:02:44 +01:00
David Guillen Fandos 11ec213c99 Make ewram memory linear
This saves a few cycles in MIPS and simplifies the core a bit.
Removed the write map, only affects interpreter performance very
minimally. Rewired ARM and x86 handlers to support direct access to
I/EWRAM (and VRAM on ARM) to compensate. Overall performance is slightly
better but code is cleaner and allows for further improvements in the
dynarecs.
2021-03-23 19:09:56 +01:00
David Guillen Fandos eab44b9e0b Enable runtime dynarec enable/disable
Added a more thorough cache cleanup for reset/mode-change too.
Fixed the mmap initialization that ends up leaking memory.
Minor x86 asm fixes for Android.
2021-03-17 21:05:49 +01:00
David Guillen Fandos fb7ca09b01 Remove BIOS reserved translation area
This is not really necessary since it can share area with ROM.
Performance impact should be very minimal (haven't noticed it myself)
and could be compensated (even by a positive offset) if we bump the ROM
cache area size.
Tested with several dynarecs.
2021-03-17 18:33:02 +01:00
David Guillen Fandos 5ffd2832e8 Rewrite of the MIPS dynarec stubs
This allows us to emit the handlers directly in a more efficient manner.
At the same time it allows for an easy fix to emit PIC code, which is
necessary for libretro. This also enables more platform specific
optimizations and variations, perhaps even run-time multiplatform
support.
2021-03-16 22:58:58 +01:00
David Guillen Fandos 1e8097ac79 Improve and simplify dynarec JIT area.
Also fix a regression on VITA.
Use gcc/OS cache flushing routines for MIPS32 instead of synci
2021-03-12 18:05:48 +01:00
David Guillen Fandos 462f0e9784 Improve cache flush magic
Make it better and more generic. Add support for MIPS32 and fix the
messy PSP code.
2021-03-12 01:46:09 +01:00
David Guillen Fandos 5127f4b5cc Remove PSP-specific stuff from MIPS backend
This is unnecessary since newlib supports all file I/O.
This is needed for other mips ports
2021-03-10 18:41:37 +01:00
David Guillen Fandos 0522d9a4f5 Add workaround for Android ARM builds
While we are at it, use ARM mode for better performance.
2021-03-09 19:29:18 +01:00
David Guillen Fandos 3d558413fd Fix x86 dynarec, broken by d10c4afe
The dynarec expects function args to be located in registers instead of
the stack, which is not the default calling convention in GCC/clang.
2021-03-06 21:15:22 +01:00
David Guillen Fandos 89bd699837 Reduce executable size by 90%
Turns out most of that file ends up in JIT section, which is RWX and not
a very nice way to run code really (security issues aside).
This also makes it possible to build that file with -ggdb otherwise it
complains about stuff.
2021-03-05 01:14:31 +01:00
David Guillen Fandos ed3ba2c18b More cleanups (mostly whitespace and unused stuff) 2021-02-15 21:51:49 +01:00