Commit Graph

64 Commits

Author SHA1 Message Date
David Guillen Fandos b552d5eb7e Improve open bus reads on ARM/MIPS 2023-01-05 21:29:20 +01:00
David Guillen Fandos 3a6ca8d941 Better cycle accounting, taking remainders partially into account 2021-12-21 19:59:33 +01:00
David Guillen Fandos 9b25f26ed8 [x86] Fix division r3 sign value calculation 2021-12-13 22:26:49 +01:00
David Guillen Fandos 09cab16654 [x86] Do not generate unnecessary flags (optimization) 2021-12-13 18:31:01 +01:00
David Guillen Fandos 2419b77b28 Add reg tracing capability (for devs) 2021-12-11 11:27:59 +01:00
David Guillen Fandos 61ef776fed [x86] Fix CPSR store bug in LR register
There's a race condition on CPSR store (only if mode is changed) where,
if an IRQ is pending, the IRQ will be served, but the saved LR value
will be wrong (will skip the return instruction).
Fixed this and improved the logic a bit to make it faster and not use
unnecessary save slots.
2021-12-10 19:09:20 +01:00
David Guillen Fandos a435c712f8 [x86] Fix multiplication flags for 64 bit muls
This fixes a couple of games.
2021-12-06 18:15:56 +01:00
David Guillen Fandos 186950d0ad [x86] Minor simplifications 2021-11-07 19:19:19 +01:00
David Guillen Fandos e3d5ca8419 [x86/x64] Add support for x86-64 and improve 32 bit mode too.
This adds support for x86-64 dynarec both on Windows and Linux. Since
they have different requirements there's some macro magic in the stubs
file.

This also fixes x86 support in some cases: stack alignment requirements
where violated all over. This allows the usage of clang as a compiler
(which has a tendency to use SSE instructions more often than gcc does).

To support this I also reworked the mmap/VirtualAlloc magic to make sure
JIT arena stays close to .text.

Fixed some other minor issues and removed some unnecessary JIT code here
and there. clang tends to do some (wrong?) assumptions about global
symbols alignment.
2021-11-06 12:17:50 +01:00
David Guillen Fandos 3a7fedb8fb Simplify MMAP machinery for Win/Lin/Mac/Android
This gets rid of the bloated memmap_win32.c in favour of a much simpler
wrapper. This will be needed in the future since the wrapper does not
support MAP_FIXED maps (necessary for some platforms)
2021-11-05 18:23:05 +01:00
David Guillen Fandos 09a7afe216 [x86] Remove usage of esi register 2021-11-04 20:31:50 +01:00
David Guillen Fandos fc55198b76 [x86] Implement load handlers in asm stubs for speed 2021-11-03 22:20:31 +01:00
David Guillen Fandos 746503af95 [x86] Consolidate mem writes 2021-11-02 23:16:47 +01:00
David Guillen Fandos 9c99a918d0 Minor macro cleanups 2021-11-02 22:00:09 +01:00
David Guillen Fandos 6f81dbad1d Minor code cleanup to make it more readable 2021-11-02 21:14:38 +01:00
David Guillen Fandos 0d864be803 Simplify PSR stores for x86. Use only 2 function args 2021-11-02 21:02:05 +01:00
David Guillen Fandos 4d0d8dc42d [x86] Move reg_cycles to EBP, for better compatiblity 2021-11-02 20:11:46 +01:00
David Guillen Fandos 3f012afcda Make ROM hash table mechanism 64 bit compatible. 2021-10-30 22:54:51 +02:00
David Guillen Fandos 3cab8596b8 Fix SWI handling (disable IRQs)
This introduced a potential race condition between the start of a SWI
and the BIOS handling the exception by returning to system mode. During
this ~10 instruction window, having an IRQ that issues a SWI causes bad
behaviour that results in crashes or other weirdness.
Fixes a couple of games and potentially many weird and obscure bugs here
and there (hard to reproduce sometimes).
2021-09-17 22:22:01 +02:00
David Guillen Fandos ed89923fda Minor mem fixes around mirrors and u8/u16 accesses
Improve the open bus access a bit too.
2021-09-17 21:30:33 +02:00
David Guillen Fandos c2c2564e4a Simplify help functions in x86 backend 2021-09-17 20:00:37 +02:00
David Guillen Fandos 1eaf700ff6 Simplify conditional branches in x86
Make them more efficient while at it
2021-09-17 19:18:50 +02:00
David Guillen Fandos 9b0d685092 Fix divide by zero in the x86 BIOS HLE
Fixes ~10 games that divide zero by zero
2021-09-16 19:30:30 +02:00
David Guillen Fandos 33f1e25099 Emit BIOS SWI entrypoint to ROM arena
This fixes a race condition that happens whenever the ROM cache is flushed but
the RAM one is not, causing any SWI calls (implemented as direct branches) to
jump to random instructions.
The fix could be to flush both caches at the same time (~expensive on
low mem platforms), use indirect jumps (a bit expensive) or emit the SWI
handler below the watermark to ensure it is never flushed. This is cheap
and effective, requires minimal changes.
2021-09-10 00:30:55 +02:00
David Guillen Fandos ad6bf7f24a Minor x86 edits 2021-09-03 18:51:57 +02:00
David Guillen Fandos f51ed9de13 Improve SWI codepaths and implement div&divarm natively 2021-09-03 01:01:37 +02:00
David Guillen Fandos e0708b1dcf x86: Simplify thumb instructions and remove last function calls 2021-09-01 19:34:43 +02:00
David Guillen Fandos 8dda395c54 Implement asr/lsl/ror/lsr operand2 natively
This gives some small perf bump
2021-08-31 21:18:22 +02:00
David Guillen Fandos 66e011b0a3 Write most alu/log x86 operations as emitted code
This is around 8% perf improvement alone.
This also fixes many flag calculation/usage bugs (in corner cases) since
we use the x86 cpu native ALU flags (which are more or less the same as
ARM's). Passes all test ROMs for ALUs and no changes in game compat.
2021-08-31 00:42:47 +02:00
David Guillen Fandos 55c6a69ccd Move flag regs to unserialized area
This is only used in x86 (mips and arm use native regs and never spill)
2021-08-28 17:50:25 +02:00
David Guillen Fandos 8207775256 Fix out of bounds read bug on open bus read
This bug doesn't affect many games but makes sanitizers unhappy.
Also fix some minor FIFO clear bug
2021-08-24 19:55:37 +02:00
David Guillen Fandos f5232543f5 Improve tracing prints 2021-08-15 22:48:43 +02:00
David Guillen Fandos 5be5015338 Rearrange register layout and exclude useless regs from savestat
This changes the savestate format once again.
2021-08-15 21:07:20 +02:00
David Guillen Fandos ce14e2585b Fix and reenable Android arm 32 bit builds
Removed the last bits of text relocations by moving all relevant RAMs to
the stub reachable area. This should be as fast or even faster than
previous code.
2021-07-31 17:45:46 +02:00
David Guillen Fandos ab7d9bb161 Move membuffers close to dynarec area to fix x86 relocs
This essentially makes it easier to get a relocation-free text area for
x86 so that Android loaders are happy.
2021-07-28 19:12:43 +02:00
David Guillen Fandos b0947a1ae1 Promote nested functions to macros, fix clang builds
Add x86 Android builds back to the CI now that it's fixed (tested with
NDK r21)
2021-07-26 21:41:07 +02:00
David Guillen Fandos 37430f22c5 Small optimization (~2-4%) and whitespace cleanup!
Cleans up a ton of whitespace in cpu.c (like 100KB!) and improves
readability of some massive decode statements.

Added an optimization for PC-relative loads (pool load) in ROM (since
it's read only and cannot possibily change) that directly emits an
immediate load. This is way faster, specially in MIPS/x86, ARM can be
even faster if we rewrite the immediate load macros to also use a pool.
2021-05-07 20:41:54 +02:00
David Guillen Fandos 4fd456e158 Adding Code Breaker cheat support
This works on both interpreter and dynarec.
Tested in MIPS, ARM and x86, still needs some more testing, some edge
cases can be buggy.
2021-05-05 21:15:27 +02:00
David Guillen Fandos 5b5a4db6c2 Add instruction tracing, for testing purposes 2021-04-03 00:37:42 +02:00
David Guillen Fandos 7ea6c5e247 Move OAM RAM to stubs also
Makes accesses more efficient for MIPS. Make accesses also fast for palette
reads.
2021-03-26 23:13:26 +01:00
David Guillen Fandos a494a3f00e Move OAM update flag to a register
Fix a small bug in MIPS dynarec that affects non -G0 targets
2021-03-26 23:13:26 +01:00
David Guillen Fandos ff510e7f7a Move caches to stub files to get around gcc 10
Seems that using the __atribute__ magic for sections is not the best way
of doing this, since it injects some default atributtes that collide
with the user defined ones. Using assembly is far easier in this case.

Reworked definitions a bit to make it easier to import from assembly.
Also wrapped stuff around macros for easy and less verbose
implementation of the symbol prefix issue.
2021-03-23 20:02:44 +01:00
David Guillen Fandos 11ec213c99 Make ewram memory lineal
This saves a few cycles in MIPS and simplifies a bit the core.
Removed the write map, only affects interpreter performance very
minimally. Rewired ARM and x86 handlers to support direct access to
I/EWRAM (and VRAM on ARM) to compensate. Overall performance is slightly
better but code is cleaner and allows for further improvements in the
dynarecs.
2021-03-23 19:09:56 +01:00
Autechre 5ef784ab8a
Merge pull request #112 from davidgfnet/master
Enable runtime dynarec enable/disable
2021-03-18 03:15:03 +01:00
David Guillen Fandos 9de4220376 asm fixes for clang 2021-03-18 01:20:14 +01:00
David Guillen Fandos eab44b9e0b Enable runtime dynarec enable/disable
Added a more thorough cache cleanup for reset/mode-change too.
Fixed the mmap initialization that ends up leaking memory.
Minor x86 asm fixes for Android.
2021-03-17 21:05:49 +01:00
David Guillen Fandos 34e672ed25 Simplify open load handling for MIPS and fix other arches
Also rewrite a bit memory handlers for smaller functions.
2021-03-16 22:58:58 +01:00
David Guillen Fandos 46cad2958a Move a few more registers to context
This gets rid of some more absolute addrs in the MIPS dynarec.
Tested on several platforms, we should be good.
2021-03-16 01:02:10 +01:00
David Guillen Fandos c86b9064df Move palettes around to simplify MIPS dynarec
Will move also OAM structures to gain a few cycles per load/store.
Loads can also be optimized for an extra instruction per access.
2021-03-15 02:25:02 +01:00
David Guillen Fandos 56dc6ecb70 Remove libco
This removes libco and all the usages of it (+pthreads).
Rewired all dynarecs and interpreter to return after every frame so that
libretro can process events. This required to make dynarec re-entrant.

Dynarecs were updated to check for new frame on every update (IRQ, cycle
exhaustion, I/O write, etc). The performance impact of doing so should
be minimal (and definitely outweight the libco gains). While at it,
fixed small issues to get a bit more perf: arm dynarec was not idling
correctly, mips was using stack when not needed, etc.

Tested on PSP (mips), OGA (armv7), Linux (x86 and interpreter). Not
tested on Android though.
2021-03-08 18:44:03 +01:00