Compare commits


1 Commit

Author SHA1 Message Date
crueter
ba9486fe18 revert #2834
revert [vk] Correct polygon draw topology mapping for line and point modes (#2834)

Co-authored-by: Ribbit <ribbit@placeholder.com>
Reviewed-on: https://git.eden-emu.dev/eden-emu/eden/pulls/2834
Reviewed-by: MaranBr <maranbr@eden-emu.dev>
Reviewed-by: crueter <crueter@eden-emu.dev>
Co-authored-by: Ribbit <ribbit@eden-emu.dev>
Co-committed-by: Ribbit <ribbit@eden-emu.dev>
2025-12-06 21:34:24 +01:00
105 changed files with 3694 additions and 2783 deletions

View File

@@ -487,6 +487,10 @@ else()
# wow
find_package(Boost 1.57.0 CONFIG REQUIRED OPTIONAL_COMPONENTS headers context system fiber filesystem)
if (CMAKE_SYSTEM_NAME STREQUAL "Linux" OR ANDROID)
find_package(gamemode 1.7 MODULE)
endif()
if (ENABLE_OPENSSL)
find_package(OpenSSL 1.1.1 REQUIRED)
endif()
@@ -562,7 +566,6 @@ find_package(VulkanUtilityLibraries)
find_package(SimpleIni)
find_package(SPIRV-Tools)
find_package(sirit)
find_package(gamemode)
if (ARCHITECTURE_x86 OR ARCHITECTURE_x86_64)
find_package(xbyak)

View File

@@ -1,154 +0,0 @@
# User Handbook - Adding Boolean Settings Toggles
> [!WARNING]
> This guide is intended for developers ONLY. If you are not a developer, this is likely irrelevant to you.
>
> If you want to add temporary toggles, please refer to **[Adding Debug Knobs](AddingDebugKnobs.md)**
This guide will walk you through adding a new boolean toggle setting to Eden's configuration, covering both the Qt (PC) and Kotlin (Android) UIs.
## Index
1. [Step 1 - src/common/settings](#step-1-src-common-settings)
2. [Qt (PC) Steps](#qt-pc-steps)
* [Step 2 - src/qt_common/config/shared_translation.cpp](#step-2-src-qt_common-config-shared_translation-cpp)
3. [Android Steps](#android-steps)
* [Step 3 - BooleanSetting.kt](#step-3-src-android-app-src-main-java-org-yuzu-yuzu_emu-features-settings-model-booleansetting-kt)
* [Step 4 - SettingsItem.kt](#step-4-src-android-app-src-main-java-org-yuzu-yuzu_emu-features-settings-model-view-settingsitem-kt)
* [Step 5 - SettingsFragmentPresenter.kt](#step-5-src-android-app-src-main-java-org-yuzu-yuzu_emu-features-settings-ui-settingsfragmentpresenter-kt)
* [Step 6 - strings.xml](#step-6-src-android-app-src-main-res-values-strings-xml)
4. [Step 7 - Use Your Toggle](#step-7-use-your-toggle)
5. [Best Practices](#best-practices)
---
## Step 1 - src/common/settings
First, add your desired toggle in `settings.h`.
Example:
```cpp
SwitchableSetting<bool> your_setting_name{linkage, false, "your_setting_name", Category::RendererExtensions};
```
NOTE - If you want your toggle to be on by default, change the `false` after `linkage,` to `true`.
### Remember to add your toggle to the appropriate category, for example:
Common Categories:
* Category::Renderer
* Category::RendererAdvanced
* Category::RendererExtensions
* Category::System
* Category::Core
---
## Qt (PC) Steps
### Step 2 - src/qt_common/config/shared_translation.cpp
Now add the toggle to the Qt (PC) UI in `shared_translation.cpp`.
Find where you want it to appear and place it there.
Example:
```cpp
INSERT(Settings,
your_setting_name,
tr("Your Setting Display Name"),
tr("Detailed description of what this setting does.\n"
"You can use multiple lines.\n"
"Explain any caveats or requirements."));
```
### Make sure to:
* Keep display naming consistent
* Put detailed info in the description
* Use `\n` for line breaks in descriptions
---
## Android Steps
### Step 3 - src/android/app/src/main/java/org/yuzu/yuzu_emu/features/settings/model/BooleanSetting.kt
Now add the setting to `BooleanSetting.kt`, in the position where it should appear among the settings.
Example:
```kotlin
RENDERER_YOUR_SETTING_NAME("your_setting_name"),
```
Make sure the prefix of the name matches the desired category.
### Step 4 - src/android/app/src/main/java/org/yuzu/yuzu_emu/features/settings/model/view/SettingsItem.kt
Now you may add the toggle to the Kotlin (Android) UI inside `SettingsItem.kt`.
Example:
```kotlin
put(
SwitchSetting(
BooleanSetting.RENDERER_YOUR_SETTING_NAME,
titleId = R.string.your_setting_name,
descriptionId = R.string.your_setting_name_description
)
)
```
### Step 5 - src/android/app/src/main/java/org/yuzu/yuzu_emu/features/settings/ui/SettingsFragmentPresenter.kt
Now add your setting to `SettingsFragmentPresenter.kt`, under the correct category.
Example:
```kotlin
add(BooleanSetting.RENDERER_YOUR_SETTING_NAME.key)
```
Remember, placement matters: settings appear in the order in which you add them.
### Step 6 - src/android/app/src/main/res/values/strings.xml
Now add your setting and description to `strings.xml` in the appropriate place.
Example:
```xml
<string name="your_setting_name">Your Setting Display Name</string>
<string name="your_setting_name_description">Detailed description of what this setting does. Explain any caveats, requirements, or warnings here.</string>
```
---
## Step 7 - Use Your Toggle!
Now that the UI part is done, find a place in the code for the toggle,
and use it to your heart's desire!
Example:
```cpp
const bool your_value = Settings::values.your_setting_name.GetValue();
if (your_value) {
// Do something when enabled
}
```
If you wish to do something only when the toggle is disabled,
use `if (!your_value) {` instead of `if (your_value) {`.
---
## Best Practices
* Naming - Use clear, descriptive names that work for both developers and users.
* Defaults - Choose safe default values (usually false for new features).
* Documentation - Write clear descriptions explaining when and why to use the setting.
* Categories - Put settings in the appropriate category.
* Order - Place related settings near each other.
* Testing - Always test on both PC and Android before committing when possible.
### Thank you for reading; I hope this guide helped you create your toggle!

View File

@@ -1,119 +0,0 @@
# User Handbook - Adding Debug Knobs
Debug Knobs is a 16-bit integer setting (`debug_knobs`) in the Eden Emulator that serves as a bitmask for gating various testing and debugging features. It lets developers and advanced users enable or disable specific debug behaviors without having to ship complete but temporary toggles.
The setting ranges from 0 to 65535 (0x0000 to 0xFFFF), where each bit represents a different debug feature flag.
## Index
1. [Advantages](#advantages)
2. [Usage](#usage)
* [Accessing Debug Knobs (dev side)](#accessing-debug-knobs-dev-side)
* [Setting Debug Knobs (user side)](#setting-debug-knobs-user-side)
* [Bit Manipulation Examples](#bit-manipulation-examples)
3. [Examples](#examples)
* [Example 1: Conditional Debug Logging](#example-1-conditional-debug-logging)
* [Example 2: Performance Tuning](#example-2-performance-tuning)
* [Example 3: Feature Gating](#example-3-feature-gating)
4. [Best Practices](#best-practices)
---
## Advantages
The main advantage is avoiding the deployment of disposable toggles (those made only for the testing stage and discarded once the new feature is ready to merge). This frees developers from the frontend bureaucracy and hassle of adding new toggles.
Common advantages recap:
* **Fine-Grained Control**: Enable or disable up to 16 individual debug features independently using bit manipulation on a single build
* **Runtime Configuration**: Change debug behavior at runtime, just as a dedicated toggle would
* **Safe incremental development**: New debug features can be added while their impact stays isolated from previous deployments
## Usage
### Accessing Debug Knobs (dev side)
Use the `Settings::getDebugKnobAt(u8 i)` function to check if a specific bit is set:
```cpp
#include "common/settings.h"
// Check if bit 0 is set
bool feature_enabled = Settings::getDebugKnobAt(0);
// Check if bit 15 is set
bool another_feature = Settings::getDebugKnobAt(15);
```
The function returns `true` if the specified bit (0-15) is set in the `debug_knobs` value, `false` otherwise.
### Setting Debug Knobs (user side)
Developers must document which knob bits are tied to each functionality under test.
The debug knobs value can be set through:
1. **Desktop UI**: In the Debug configuration tab, there's a spinbox for "Debug knobs" (0-65535)
2. **Android UI**: Available as an integer setting in the Debug section
3. **Configuration Files**: Set the `debug_knobs` value in the emulator's configuration
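As a sketch, in an INI-style configuration file this would look like the following (the section name here is an assumption; only the `debug_knobs` key comes from the setting itself):

```ini
[Debug]
# Enable the debug features gated on bits 0 and 1 (value 3 = 2^0 + 2^1)
debug_knobs = 3
```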
### Bit Manipulation Examples
To enable specific features, calculate the decimal value by setting the appropriate bits:
* **Enable only bit 0**: Value = 1 (2^0)
* **Enable only bit 1**: Value = 2 (2^1)
* **Enable bits 0 and 1**: Value = 3 (2^0 + 2^1)
* **Enable bit 15**: Value = 32768 (2^15)
## Examples
### Example 1: Conditional Debug Logging
```cpp
void SomeFunction() {
if (Settings::getDebugKnobAt(0)) {
LOG_DEBUG(Common, "Debug feature 0 is enabled");
// Additional debug code here
}
if (Settings::getDebugKnobAt(1)) {
LOG_DEBUG(Common, "Debug feature 1 is enabled");
// Different debug behavior
}
}
```
### Example 2: Performance Tuning
```cpp
bool UseOptimizedPath() {
// Skip optimization if debug bit 2 is set for testing
return !Settings::getDebugKnobAt(2);
}
```
### Example 3: Feature Gating
```cpp
void ExperimentalFeature() {
static constexpr u8 EXPERIMENTAL_FEATURE_BIT = 3;
if (!Settings::getDebugKnobAt(EXPERIMENTAL_FEATURE_BIT)) {
// Fallback to stable implementation
StableImplementation();
return;
}
// Experimental implementation
ExperimentalImplementation();
}
```
## Best Practices
* This setting is intended for development and testing purposes only
* Knobs must be unwired before PR creation
* The setting is per-game configurable, allowing different debug setups for different titles

View File

@@ -14,4 +14,3 @@ This handbook is primarily aimed at the end-user - baking useful knowledge for e
- **[Orphaned Profiles](Orphaned.md)**
- **[Command Line](CommandLine.md)**
- **[Native Application Development](Native.md)**
- **[Adding Boolean Settings Toggles](AddingBooleanToggles.md)**

View File

@@ -267,11 +267,9 @@ if (ANDROID AND ARCHITECTURE_arm64)
AddJsonPackage(libadrenotools)
endif()
AddJsonPackage(gamemode)
if (gamemode_ADDED)
if (UNIX AND NOT APPLE AND NOT TARGET gamemode::headers)
add_library(gamemode INTERFACE)
target_include_directories(gamemode INTERFACE ${gamemode_SOURCE_DIR}/lib)
target_include_directories(gamemode INTERFACE gamemode)
add_library(gamemode::headers ALIAS gamemode)
endif()

View File

@@ -218,12 +218,5 @@
"artifact": "MoltenVK-macOS.tar",
"hash": "5695b36ca5775819a71791557fcb40a4a5ee4495be6b8442e0b666d0c436bec02aae68cc6210183f7a5c986bdbec0e117aecfad5396e496e9c2fd5c89133a347",
"bundled": true
},
"gamemode": {
"repo": "FeralInteractive/gamemode",
"sha": "ce6fe122f3",
"hash": "e87ec14ed3e826d578ebf095c41580069dda603792ba91efa84f45f4571a28f4d91889675055fd6f042d7dc25b0b9443daf70963ae463e38b11bcba95f4c65a9",
"version": "1.7",
"find_args": "MODULE"
}
}

externals/gamemode/gamemode_client.h (vendored new file, 376 lines)
View File

@@ -0,0 +1,376 @@
/*
Copyright (c) 2017-2019, Feral Interactive
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
* Neither the name of Feral Interactive nor the names of its contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.
*/
#ifndef CLIENT_GAMEMODE_H
#define CLIENT_GAMEMODE_H
/*
* GameMode supports the following client functions
* Requests are refcounted in the daemon
*
* int gamemode_request_start() - Request gamemode starts
* 0 if the request was sent successfully
* -1 if the request failed
*
* int gamemode_request_end() - Request gamemode ends
* 0 if the request was sent successfully
* -1 if the request failed
*
* GAMEMODE_AUTO can be defined to make the above two functions apply during static init and
* destruction, as appropriate. In this configuration, errors will be printed to stderr
*
* int gamemode_query_status() - Query the current status of gamemode
* 0 if gamemode is inactive
* 1 if gamemode is active
* 2 if gamemode is active and this client is registered
* -1 if the query failed
*
* int gamemode_request_start_for(pid_t pid) - Request gamemode starts for another process
* 0 if the request was sent successfully
* -1 if the request failed
* -2 if the request was rejected
*
* int gamemode_request_end_for(pid_t pid) - Request gamemode ends for another process
* 0 if the request was sent successfully
* -1 if the request failed
* -2 if the request was rejected
*
* int gamemode_query_status_for(pid_t pid) - Query status of gamemode for another process
* 0 if gamemode is inactive
* 1 if gamemode is active
* 2 if gamemode is active and this client is registered
* -1 if the query failed
*
* const char* gamemode_error_string() - Get an error string
* returns a string describing any of the above errors
*
* Note: All the above requests can be blocking - dbus requests can and will block while the daemon
* handles the request. It is not recommended to make these calls in performance critical code
*/
#include <stdbool.h>
#include <stdio.h>
#include <dlfcn.h>
#include <string.h>
#include <assert.h>
#include <sys/types.h>
static char internal_gamemode_client_error_string[512] = { 0 };
/**
* Load libgamemode dynamically to dislodge us from most dependencies.
* This allows clients to link and/or use this regardless of runtime.
* See SDL2 for an example of the reasoning behind this in terms of
* dynamic versioning as well.
*/
static volatile int internal_libgamemode_loaded = 1;
/* Typedefs for the functions to load */
typedef int (*api_call_return_int)(void);
typedef const char *(*api_call_return_cstring)(void);
typedef int (*api_call_pid_return_int)(pid_t);
/* Storage for functors */
static api_call_return_int REAL_internal_gamemode_request_start = NULL;
static api_call_return_int REAL_internal_gamemode_request_end = NULL;
static api_call_return_int REAL_internal_gamemode_query_status = NULL;
static api_call_return_cstring REAL_internal_gamemode_error_string = NULL;
static api_call_pid_return_int REAL_internal_gamemode_request_start_for = NULL;
static api_call_pid_return_int REAL_internal_gamemode_request_end_for = NULL;
static api_call_pid_return_int REAL_internal_gamemode_query_status_for = NULL;
/**
* Internal helper to perform the symbol binding safely.
*
* Returns 0 on success and -1 on failure
*/
__attribute__((always_inline)) static inline int internal_bind_libgamemode_symbol(
void *handle, const char *name, void **out_func, size_t func_size, bool required)
{
void *symbol_lookup = NULL;
char *dl_error = NULL;
/* Safely look up the symbol */
symbol_lookup = dlsym(handle, name);
dl_error = dlerror();
if (required && (dl_error || !symbol_lookup)) {
snprintf(internal_gamemode_client_error_string,
sizeof(internal_gamemode_client_error_string),
"dlsym failed - %s",
dl_error);
return -1;
}
/* Have the symbol correctly, copy it to make it usable */
memcpy(out_func, &symbol_lookup, func_size);
return 0;
}
/**
* Loads libgamemode and needed functions
*
* Returns 0 on success and -1 on failure
*/
__attribute__((always_inline)) static inline int internal_load_libgamemode(void)
{
/* We start at 1, 0 is a success and -1 is a fail */
if (internal_libgamemode_loaded != 1) {
return internal_libgamemode_loaded;
}
/* Anonymous struct type to define our bindings */
struct binding {
const char *name;
void **functor;
size_t func_size;
bool required;
} bindings[] = {
{ "real_gamemode_request_start",
(void **)&REAL_internal_gamemode_request_start,
sizeof(REAL_internal_gamemode_request_start),
true },
{ "real_gamemode_request_end",
(void **)&REAL_internal_gamemode_request_end,
sizeof(REAL_internal_gamemode_request_end),
true },
{ "real_gamemode_query_status",
(void **)&REAL_internal_gamemode_query_status,
sizeof(REAL_internal_gamemode_query_status),
false },
{ "real_gamemode_error_string",
(void **)&REAL_internal_gamemode_error_string,
sizeof(REAL_internal_gamemode_error_string),
true },
{ "real_gamemode_request_start_for",
(void **)&REAL_internal_gamemode_request_start_for,
sizeof(REAL_internal_gamemode_request_start_for),
false },
{ "real_gamemode_request_end_for",
(void **)&REAL_internal_gamemode_request_end_for,
sizeof(REAL_internal_gamemode_request_end_for),
false },
{ "real_gamemode_query_status_for",
(void **)&REAL_internal_gamemode_query_status_for,
sizeof(REAL_internal_gamemode_query_status_for),
false },
};
void *libgamemode = NULL;
/* Try and load libgamemode */
libgamemode = dlopen("libgamemode.so.0", RTLD_NOW);
if (!libgamemode) {
/* Attempt to load unversioned library for compatibility with older
* versions (as of writing, there are no ABI changes between the two -
* this may need to change if ever ABI-breaking changes are made) */
libgamemode = dlopen("libgamemode.so", RTLD_NOW);
if (!libgamemode) {
snprintf(internal_gamemode_client_error_string,
sizeof(internal_gamemode_client_error_string),
"dlopen failed - %s",
dlerror());
internal_libgamemode_loaded = -1;
return -1;
}
}
/* Attempt to bind all symbols */
for (size_t i = 0; i < sizeof(bindings) / sizeof(bindings[0]); i++) {
struct binding *binder = &bindings[i];
if (internal_bind_libgamemode_symbol(libgamemode,
binder->name,
binder->functor,
binder->func_size,
binder->required)) {
internal_libgamemode_loaded = -1;
return -1;
};
}
/* Success */
internal_libgamemode_loaded = 0;
return 0;
}
/**
* Redirect to the real libgamemode
*/
__attribute__((always_inline)) static inline const char *gamemode_error_string(void)
{
/* If we fail to load the system gamemode, or we have an error string already, return our error
* string instead of diverting to the system version */
if (internal_load_libgamemode() < 0 || internal_gamemode_client_error_string[0] != '\0') {
return internal_gamemode_client_error_string;
}
/* Assert for static analyser that the function is not NULL */
assert(REAL_internal_gamemode_error_string != NULL);
return REAL_internal_gamemode_error_string();
}
/**
* Redirect to the real libgamemode
* Allow automatically requesting game mode
* Also prints errors as they happen.
*/
#ifdef GAMEMODE_AUTO
__attribute__((constructor))
#else
__attribute__((always_inline)) static inline
#endif
int gamemode_request_start(void)
{
/* Need to load gamemode */
if (internal_load_libgamemode() < 0) {
#ifdef GAMEMODE_AUTO
fprintf(stderr, "gamemodeauto: %s\n", gamemode_error_string());
#endif
return -1;
}
/* Assert for static analyser that the function is not NULL */
assert(REAL_internal_gamemode_request_start != NULL);
if (REAL_internal_gamemode_request_start() < 0) {
#ifdef GAMEMODE_AUTO
fprintf(stderr, "gamemodeauto: %s\n", gamemode_error_string());
#endif
return -1;
}
return 0;
}
/* Redirect to the real libgamemode */
#ifdef GAMEMODE_AUTO
__attribute__((destructor))
#else
__attribute__((always_inline)) static inline
#endif
int gamemode_request_end(void)
{
/* Need to load gamemode */
if (internal_load_libgamemode() < 0) {
#ifdef GAMEMODE_AUTO
fprintf(stderr, "gamemodeauto: %s\n", gamemode_error_string());
#endif
return -1;
}
/* Assert for static analyser that the function is not NULL */
assert(REAL_internal_gamemode_request_end != NULL);
if (REAL_internal_gamemode_request_end() < 0) {
#ifdef GAMEMODE_AUTO
fprintf(stderr, "gamemodeauto: %s\n", gamemode_error_string());
#endif
return -1;
}
return 0;
}
/* Redirect to the real libgamemode */
__attribute__((always_inline)) static inline int gamemode_query_status(void)
{
/* Need to load gamemode */
if (internal_load_libgamemode() < 0) {
return -1;
}
if (REAL_internal_gamemode_query_status == NULL) {
snprintf(internal_gamemode_client_error_string,
sizeof(internal_gamemode_client_error_string),
"gamemode_query_status missing (older host?)");
return -1;
}
return REAL_internal_gamemode_query_status();
}
/* Redirect to the real libgamemode */
__attribute__((always_inline)) static inline int gamemode_request_start_for(pid_t pid)
{
/* Need to load gamemode */
if (internal_load_libgamemode() < 0) {
return -1;
}
if (REAL_internal_gamemode_request_start_for == NULL) {
snprintf(internal_gamemode_client_error_string,
sizeof(internal_gamemode_client_error_string),
"gamemode_request_start_for missing (older host?)");
return -1;
}
return REAL_internal_gamemode_request_start_for(pid);
}
/* Redirect to the real libgamemode */
__attribute__((always_inline)) static inline int gamemode_request_end_for(pid_t pid)
{
/* Need to load gamemode */
if (internal_load_libgamemode() < 0) {
return -1;
}
if (REAL_internal_gamemode_request_end_for == NULL) {
snprintf(internal_gamemode_client_error_string,
sizeof(internal_gamemode_client_error_string),
"gamemode_request_end_for missing (older host?)");
return -1;
}
return REAL_internal_gamemode_request_end_for(pid);
}
/* Redirect to the real libgamemode */
__attribute__((always_inline)) static inline int gamemode_query_status_for(pid_t pid)
{
/* Need to load gamemode */
if (internal_load_libgamemode() < 0) {
return -1;
}
if (REAL_internal_gamemode_query_status_for == NULL) {
snprintf(internal_gamemode_client_error_string,
sizeof(internal_gamemode_client_error_string),
"gamemode_query_status_for missing (older host?)");
return -1;
}
return REAL_internal_gamemode_query_status_for(pid);
}
#endif // CLIENT_GAMEMODE_H

View File

@@ -24,6 +24,7 @@ SPDX-License-Identifier: GPL-3.0-or-later
<uses-permission android:name="android.permission.POST_NOTIFICATIONS" />
<uses-permission android:name="android.permission.ACCESS_WIFI_STATE" />
<uses-permission android:name="android.permission.VIBRATE" />
<uses-permission android:name="android.permission.WRITE_SECURE_SETTINGS" android:required="false" />
<uses-permission android:name="android.permission.BLUETOOTH_CONNECT" />
<uses-permission android:name="android.permission.BLUETOOTH_SCAN" />
<uses-permission android:name="android.permission.FOREGROUND_SERVICE" />

View File

@@ -25,6 +25,7 @@ enum class BooleanSetting(override val key: String) : AbstractBooleanSetting {
RENDERER_ASYNCHRONOUS_SHADERS("use_asynchronous_shaders"),
RENDERER_FAST_GPU("use_fast_gpu_time"),
RENDERER_REACTIVE_FLUSHING("use_reactive_flushing"),
RENDERER_EARLY_RELEASE_FENCES("early_release_fences"),
SYNC_MEMORY_OPERATIONS("sync_memory_operations"),
BUFFER_REORDER_DISABLE("disable_buffer_reorder"),
RENDERER_DEBUG("debug"),
@@ -51,6 +52,10 @@ enum class BooleanSetting(override val key: String) : AbstractBooleanSetting {
SHOW_FW_VERSION("show_firmware_version"),
SOC_OVERLAY_BACKGROUND("soc_overlay_background"),
FRAME_INTERPOLATION("frame_interpolation"),
// FRAME_SKIPPING("frame_skipping"),
ENABLE_INPUT_OVERLAY_AUTO_HIDE("enable_input_overlay_auto_hide"),
PERF_OVERLAY_BACKGROUND("perf_overlay_background"),

View File

@@ -236,6 +236,21 @@ abstract class SettingsItem(
override fun reset() = BooleanSetting.USE_DOCKED_MODE.reset()
}
put(
SwitchSetting(
BooleanSetting.FRAME_INTERPOLATION,
titleId = R.string.frame_interpolation,
descriptionId = R.string.frame_interpolation_description
)
)
// put(
// SwitchSetting(
// BooleanSetting.FRAME_SKIPPING,
// titleId = R.string.frame_skipping,
// descriptionId = R.string.frame_skipping_description
// )
// )
put(
SwitchSetting(
@@ -689,6 +704,13 @@ abstract class SettingsItem(
descriptionId = R.string.renderer_reactive_flushing_description
)
)
put(
SwitchSetting(
BooleanSetting.RENDERER_EARLY_RELEASE_FENCES,
titleId = R.string.renderer_early_release_fences,
descriptionId = R.string.renderer_early_release_fences_description
)
)
put(
SwitchSetting(
BooleanSetting.SYNC_MEMORY_OPERATIONS,

View File

@@ -460,8 +460,10 @@ class SettingsFragmentPresenter(
add(IntSetting.RENDERER_SAMPLE_SHADING_FRACTION.key)
add(HeaderSetting(R.string.veil_renderer))
add(BooleanSetting.RENDERER_EARLY_RELEASE_FENCES.key)
add(IntSetting.DMA_ACCURACY.key)
add(BooleanSetting.BUFFER_REORDER_DISABLE.key)
add(BooleanSetting.FRAME_INTERPOLATION.key)
add(BooleanSetting.RENDERER_FAST_GPU.key)
add(IntSetting.FAST_GPU_TIME.key)
add(IntSetting.RENDERER_SHADER_BACKEND.key)

View File

@@ -1127,7 +1127,14 @@ class EmulationFragment : Fragment(), SurfaceHolder.Callback {
val actualFps = perfStats[FPS]
if (BooleanSetting.SHOW_FPS.getBoolean(needsGlobal)) {
val enableFrameInterpolation =
BooleanSetting.FRAME_INTERPOLATION.getBoolean()
// val enableFrameSkipping = BooleanSetting.FRAME_SKIPPING.getBoolean()
var fpsText = String.format("FPS: %.1f", actualFps)
if (enableFrameInterpolation) {
fpsText = String.format("eFPS: %.1f", actualFps)
}
sb.append(fpsText)
}

View File

@@ -1,6 +1,6 @@
<com.google.android.material.button.MaterialButton xmlns:android="http://schemas.android.com/apk/res/android"
xmlns:app="http://schemas.android.com/apk/res-auto"
android:layout_width="175dp"
android:layout_width="170dp"
android:layout_height="55dp"
android:layout_marginBottom="16dp"
app:iconTint="?attr/colorOnPrimary"

View File

@@ -55,7 +55,7 @@
<com.google.android.material.textview.MaterialTextView
android:id="@+id/text_confirmation"
style="@style/SynthwaveText.Accent"
android:layout_width="wrap_content"
android:layout_width="213dp"
android:layout_height="226dp"
android:paddingHorizontal="16dp"
android:paddingTop="24dp"

View File

@@ -98,6 +98,10 @@
<string name="sample_shading_fraction_description">كثافة تمرير تظليل العينة. تؤدي القيم الأعلى إلى تحسين الجودة بشكل أكبر، ولكنها تقلل أيضًا من الأداء إلى حد كبير.</string>
<string name="veil_renderer">العارض</string>
<string name="frame_interpolation">تحسين سرعة الإطارات</string>
<string name="frame_interpolation_description">يضمن تسليمًا سلسًا ومتناسقًا للإطارات من خلال مزامنة التوقيت بينها، مما يقلل من التقطيع وعدم انتظام الحركة. مثالي للألعاب التي تعاني من عدم استقرار في توقيت الإطارات أو تقطع دقيق أثناء اللعب.</string>
<string name="renderer_early_release_fences">إطلاق الأسوار مبكرًا</string>
<string name="renderer_early_release_fences_description">يساعد في إصلاح مشكلة 0 إطار في الثانية في ألعاب مثل DKCR:HD وSubnautica Below Zero وOri 2، ولكن قد يتسبب في تعطيل التحميل أو الأداء في ألعاب Unreal Engine.</string>
<string name="sync_memory_operations">مزامنة عمليات الذاكرة</string>
<string name="sync_memory_operations_description">يضمن اتساق البيانات بين عمليات الحوسبة والذاكرة. هذا الخيار قد يحل المشكلات في بعض الألعاب، ولكن قد يقلل الأداء في بعض الحالات. يبدو أن الألعاب التي تستخدم Unreal Engine 4 هي الأكثر تأثرًا.</string>
<string name="buffer_reorder_disable">تعطيل إعادة ترتيب المخزن المؤقت</string>
@@ -468,6 +472,8 @@
<string name="display">الشاشة</string>
<string name="processing">تأثيرات بعد المعالجة</string>
<string name="frame_skipping">قيد التطوير: تخطي الإطارات</string>
<string name="frame_skipping_description">تبديل تخطي الإطارات لتحسين الأداء عن طريق تقليل عدد الإطارات المعروضة. هذه الميزة قيد التطوير وسيتم تمكينها في الإصدارات المستقبلية.</string>
<string name="renderer_accuracy">مستوى الدقة</string>
<string name="renderer_resolution">الدقة (الإرساء/محمول)</string>
<string name="renderer_vsync">VSync وضع</string>
@@ -878,6 +884,11 @@
<string name="renderer_vulkan">Vulkan</string>
<string name="renderer_none">لاشيء</string>
<!-- Renderer Accuracy -->
<string name="renderer_accuracy_normal">عادي</string>
<string name="renderer_accuracy_high">عالي</string>
<string name="renderer_accuracy_extreme">أقصى</string>
<!-- DMA Accuracy -->
<string name="dma_accuracy_default">افتراضي</string>
<string name="dma_accuracy_unsafe">غير آمن</string>

View File

@@ -76,6 +76,10 @@
<string name="sample_shading_fraction_description">چڕی تێپەڕاندنی سێبەرکردنی نموونە. بەهای زیاتر کوالێتی باشتر دەکات بەڵام کارایی زیاتر کەم دەکاتەوە.</string>
<string name="veil_renderer">رێندرەر</string>
<string name="frame_interpolation">تحسين توقيت الإطارات</string>
<string name="frame_interpolation_description">يضمن تسليمًا سلسًا ومتناسقًا للإطارات من خلال مزامنة التوقيت بينها، مما يقلل من التقطيع وعدم انتظام الحركة. مثالي للألعاب التي تعاني من عدم استقرار في توقيت الإطارات أو تقطع دقيق أثناء اللعب.</string>
<string name="renderer_early_release_fences">زێدەکردنی پەرستارەکان زووتر</string>
<string name="renderer_early_release_fences_description">یارمەتی دەدات لە چارەسەری 0 FPS لە یارییەکانی وەک DKCR:HD، Subnautica Below Zero و Ori 2، بەڵام ڕەنگە بارکردن یان کارایی لە یارییەکانی Unreal Engine تێکبدات.</string>
<string name="sync_memory_operations">هاوبەشیی کردارەکانی بیرگە</string>
<string name="sync_memory_operations_description">دڵنیایی داتا لە نێوان کردارەکانی کۆمپیوتەر و بیرگە. ئەم هەڵبژاردە کێشەکان لە هەندێک یاری چارەسەر دەکات، بەڵام لە هەندێک حاڵەت کارایی کەم دەکاتەوە. وا دیارە یارییەکانی Unreal Engine 4 زۆرترین کاریگەریان هەیە.</string>
<string name="buffer_reorder_disable">ڕێکخستنەوەی بافر ناچالاک بکە</string>
@@ -378,6 +382,8 @@
<string name="display">پیشاندان</string>
<string name="processing">پاشپڕۆسەکردن</string>
<string name="frame_skipping">قيد التطوير: تخطي الإطارات</string>
<string name="frame_skipping_description">تێپەڕاندنی فرەیمەکان بکە بۆ باشترکردنی کارایی بە کەمکردنەوەی ژمارەی فرەیمە ڕێندرکراوەکان. ئەم تایبەتمەندییە هێشتا کاردەکرێت و لە وەشانە داهاتووەکاندا چالاکدەکرێت.</string>
<string name="renderer_accuracy">ئاستی وردبینی</string>
<string name="renderer_resolution">ڕوونی (دۆخی دەستی/دۆخی دۆک)</string>
<string name="renderer_vsync">دۆخی VSync</string>
@@ -623,6 +629,9 @@
<string name="renderer_vulkan">ڤوڵکان</string>
<string name="renderer_none">هیچ</string>
<!-- Renderer Accuracy -->
<string name="renderer_accuracy_normal">ئاسایی</string>
<string name="renderer_accuracy_high">بەرز</string>
<!-- DMA Accuracy -->
<string name="dma_accuracy_default">بنەڕەتی</string>
<!-- ASTC Decoding Method -->

View File

@@ -76,6 +76,10 @@
<string name="sample_shading_fraction_description">Intenzita průchodu stínování vzorku. Vyšší hodnoty zlepšují kvalitu, ale také výrazněji snižují výkon.</string>
<string name="veil_renderer">Renderer</string>
<string name="frame_interpolation">Vylepšené časování snímků</string>
<string name="frame_interpolation_description">Zajišťuje plynulé a konzistentní zobrazování snímků synchronizací jejich časování, čímž snižuje trhání a nerovnoměrné animace. Ideální pro hry, které trpí nestabilitou časování snímků nebo mikrotrháním během hraní.</string>
<string name="renderer_early_release_fences">Uvolnit ploty brzy</string>
<string name="renderer_early_release_fences_description">Pomáhá opravit 0 FPS v hrách jako DKCR:HD, Subnautica Below Zero a Ori 2, ale může narušit načítání nebo výkon v hrách na Unreal Engine.</string>
<string name="sync_memory_operations">Synchronizace paměťových operací</string>
<string name="sync_memory_operations_description">Zajišťuje konzistenci dat mezi výpočetními a paměťovými operacemi. Tato volba by měla opravit problémy v některých hrách, ale může v některých případech snížit výkon. Nejvíce postižené se zdají být hry s Unreal Engine 4.</string>
<string name="buffer_reorder_disable">Zakázat přeřazování vyrovnávací paměti</string>
@@ -366,6 +370,8 @@
<string name="display">Zobrazení</string>
<string name="processing">Postprocesing</string>
<string name="frame_skipping">WIP: Přeskočení snímků</string>
<string name="frame_skipping_description">Přepínání přeskočení snímků pro zlepšení výkonu snížením počtu vykreslených snímků. Tato funkce je stále ve vývoji a bude povolena v budoucích verzích.</string>
<string name="renderer_accuracy">Úroveň přesnosti</string>
<string name="renderer_resolution">Rozlišení (Handheld/Docked)</string>
<string name="renderer_vsync">VSync režim</string>
@@ -610,6 +616,9 @@
<string name="renderer_vulkan">Vulkan</string>
<string name="renderer_none">Žádné</string>
<!-- Renderer Accuracy -->
<string name="renderer_accuracy_normal">Normální</string>
<string name="renderer_accuracy_high">Vysoká</string>
<!-- DMA Accuracy -->
<string name="dma_accuracy_default">Výchozí</string>
<!-- ASTC Decoding Method -->


@@ -84,6 +84,10 @@
<string name="sample_shading_fraction_description">Die Intensität des Sample-Shading-Durchgangs. Höhere Werte verbessern die Qualität stärker, beeinträchtigen aber auch die Leistung stärker.</string>
<string name="veil_renderer">Renderer</string>
<string name="frame_interpolation">Erweiterte Frame-Synchronisation</string>
<string name="frame_interpolation_description">Sorgt für eine gleichmäßige und konsistente Frame-Wiedergabe durch Synchronisierung der Frame-Zeiten, was Ruckeln und ungleichmäßige Animationen reduziert. Ideal für Spiele, die unter instabilen Frame-Zeiten oder Mikrorucklern leiden.</string>
<string name="renderer_early_release_fences">Zäune früher freigeben</string>
<string name="renderer_early_release_fences_description">Behebt 0 FPS in Spielen wie DKCR:HD, Subnautica Below Zero und Ori 2, kann aber Ladezeiten oder Performance in Unreal Engine-Spielen beeinträchtigen.</string>
<string name="sync_memory_operations">Speicheroperationen synchronisieren</string>
<string name="sync_memory_operations_description">Stellt die Datenkonsistenz zwischen Compute- und Speicheroperationen sicher. Diese Option sollte Probleme in einigen Spielen beheben, kann aber in einigen Fällen die Leistung verringern. Spiele mit Unreal Engine 4 scheinen am stärksten betroffen zu sein.</string>
<string name="buffer_reorder_disable">Puffer-Neuanordnung deaktivieren</string>
@@ -419,6 +423,12 @@ Wird der Handheld-Modus verwendet, verringert es die Auflösung und erhöht die
<!-- Graphics settings strings -->
<string name="backend">Backend</string>
<string name="display">Anzeige</string>
<string name="processing">Nachbearbeitung</string>
<string name="frame_skipping">WIP: Frame Skipping</string>
<string name="frame_skipping_description">Aktivieren Sie Frame Skipping, um die Leistung durch Reduzierung der gerenderten Frames zu verbessern. Diese Funktion wird noch entwickelt und in zukünftigen Versionen verfügbar sein.</string>
<string name="renderer_accuracy">Genauigkeitsstufe</string>
<string name="renderer_resolution">Auflösung (Handheld/Gedockt)</string>
<string name="renderer_vsync">VSync-Modus</string>
<string name="renderer_screen_layout">Ausrichtung</string>
@@ -788,6 +798,9 @@ Wirklich fortfahren?</string>
<string name="renderer_vulkan">Vulkan</string>
<string name="renderer_none">Keiner</string>
<!-- Renderer Accuracy -->
<string name="renderer_accuracy_normal">Normal</string>
<string name="renderer_accuracy_high">Hoch</string>
<!-- DMA Accuracy -->
<string name="dma_accuracy_default">Standard</string>
<!-- ASTC Decoding Method -->


@@ -98,6 +98,10 @@
<string name="sample_shading_fraction_description">La intensidad del paso de sombreado de la muestra. Los valores más altos mejoran más la calidad, pero también reducen el rendimiento en mayor medida.</string>
<string name="veil_renderer">Renderizador</string>
<string name="frame_interpolation">Ritmo de fotogramas mejorado</string>
<string name="frame_interpolation_description">Garantiza una entrega de fotogramas fluida y consistente al sincronizar el tiempo entre fotogramas, reduciendo la tartamudez y la animación desigual. Ideal para juegos que experimentan inestabilidad en el tiempo de fotogramas o microtartamudeos durante el juego.</string>
<string name="renderer_early_release_fences">Liberar las vallas antes</string>
<string name="renderer_early_release_fences_description">Ayuda a arreglar 0 FPS en juegos como DKCR:HD, Subnautica Below Zero y Ori 2, pero puede romper la carga o el rendimiento en juegos de Unreal Engine.</string>
<string name="sync_memory_operations">Sincronizar operaciones de memoria</string>
<string name="sync_memory_operations_description">Garantiza la consistencia de los datos entre las operaciones de computación y memoria. Esta opción debería solucionar problemas en algunos juegos, pero también puede reducir el rendimiento en algunos casos. Los juegos de Unreal Engine 4 a menudo ven los cambios más significativos de los mismos.</string>
<string name="buffer_reorder_disable">Desactivar reordenamiento de búfer</string>
@@ -442,6 +446,8 @@
<string name="display">Pantalla</string>
<string name="processing">Postprocesado</string>
<string name="frame_skipping">WIP: Salto de fotogramas</string>
<string name="frame_skipping_description">Activa o desactiva el salto de fotogramas para mejorar el rendimiento reduciendo el número de fotogramas renderizados. Esta función está en desarrollo y se habilitará en futuras versiones.</string>
<string name="renderer_accuracy">Nivel de precisión</string>
<string name="renderer_resolution">Resolución (Portátil/Sobremesa)</string>
<string name="renderer_vsync">Modo VSync</string>
@@ -837,6 +843,11 @@
<string name="renderer_vulkan">Vulkan</string>
<string name="renderer_none">Ninguno</string>
<!-- Renderer Accuracy -->
<string name="renderer_accuracy_normal">Normal</string>
<string name="renderer_accuracy_high">Alto</string>
<string name="renderer_accuracy_extreme">Extremo</string>
<!-- DMA Accuracy -->
<string name="dma_accuracy_default">Predeterminado</string>
<string name="dma_accuracy_unsafe">Inseguro</string>


@@ -65,6 +65,12 @@
<string name="veil_misc">پردازنده و حافظه</string>
<string name="eden_veil">پرده عدن</string>
<string name="eden_veil_description">تنظیمات آزمایشی برای بهبود عملکرد و قابلیت. این تنظیمات ممکن است باعث نمایش صفحه سیاه یا سایر مشکلات بازی شود.</string>
<string name="frame_skipping">در حال توسعه: رد کردن فریم‌ها</string>
<string name="frame_skipping_description">با فعال کردن رد کردن فریم‌ها، عملکرد را با کاهش تعداد فریم‌های رندر شده بهبود دهید. این قابلیت در حال توسعه است و در نسخه‌های آینده فعال خواهد شد.</string>
<string name="frame_interpolation">زمان‌بندی پیشرفته فریم‌ها</string>
<string name="frame_interpolation_description">ارسال یکنواخت و پایدار فریم‌ها را با همگام‌سازی زمان بین آن‌ها تضمین می‌کند، که منجر به کاهش لرزش و انیمیشن‌های ناهموار می‌شود. برای بازی‌هایی که ناپایداری در زمان‌بندی فریم‌ها یا میکرو لرزش در حین بازی دارند ایده‌آل است</string>
<string name="renderer_early_release_fences">رهاسازی حصارها زودتر</string>
<string name="renderer_early_release_fences_description">به رفع مشکل 0 فریم بر ثانیه در بازی‌هایی مانند DKCR:HD، Subnautica Below Zero و Ori 2 کمک می‌کند، اما ممکن است بارگذاری یا عملکرد بازی‌های Unreal Engine را مختل کند.</string>
<string name="sync_memory_operations">همگام‌سازی عملیات حافظه</string>
<string name="sync_memory_operations_description">اطمینان از سازگاری داده‌ها بین عملیات محاسباتی و حافظه. این گزینه ممکن است مشکلات برخی بازی‌ها را رفع کند، اما در برخی موارد ممکن است عملکرد را کاهش دهد. به نظر می‌رسد بازی‌های با Unreal Engine 4 بیشترین تأثیر را داشته باشند.</string>
<string name="buffer_reorder_disable">غیرفعال کردن مرتب‌سازی مجدد بافر</string>
@@ -758,6 +764,11 @@
<string name="renderer_vulkan">Vulkan</string>
<string name="renderer_none">هیچ</string>
<!-- Renderer Accuracy -->
<string name="renderer_accuracy_normal">معمولی</string>
<string name="renderer_accuracy_high">زیاد</string>
<string name="renderer_accuracy_extreme">افراطی (کند)</string>
<!-- DMA Accuracy -->
<string name="dma_accuracy_default">پیش فرض</string>
<string name="dma_accuracy_unsafe">ناایمن (سریع)</string>


@@ -98,6 +98,10 @@
<string name="sample_shading_fraction_description">L\'intensité de la passe d\'ombrage d\'échantillon. Des valeurs plus élevées améliorent davantage la qualité mais réduisent aussi plus fortement les performances.</string>
<string name="veil_renderer">Rendu</string>
<string name="frame_interpolation">Synchronisation avancée des frames</string>
<string name="frame_interpolation_description">Assure une diffusion fluide et régulière des frames en synchronisant leur timing, réduisant ainsi les saccades et les animations irrégulières. Idéal pour les jeux souffrant d\'instabilité de timing des frames ou de micro-saccades pendant le jeu.</string>
<string name="renderer_early_release_fences">Libérer les barrières plus tôt</string>
<string name="renderer_early_release_fences_description">Résout les problèmes de 0 FPS dans des jeux comme DKCR:HD, Subnautica Below Zero et Ori 2, mais peut perturber le chargement ou les performances des jeux Unreal Engine.</string>
<string name="sync_memory_operations">Synchroniser les opérations mémoire</string>
<string name="sync_memory_operations_description">Garantit la cohérence des données entre les opérations de calcul et de mémoire. Cette option devrait résoudre les problèmes dans certains jeux, mais peut réduire les performances dans certains cas. Les jeux utilisant Unreal Engine 4 semblent être les plus affectés.</string>
<string name="buffer_reorder_disable">Désactiver le réordonnancement du tampon</string>
@@ -443,6 +447,8 @@
<string name="display">Affichage</string>
<string name="processing">Post-traitement</string>
<string name="frame_skipping">WIP: Saut de frames</string>
<string name="frame_skipping_description">Activez ou désactivez le saut d\'images pour améliorer les performances en réduisant le nombre d\'images affichées. Cette fonctionnalité est en cours de développement et sera activée dans les futures versions.</string>
<string name="renderer_accuracy">Niveau de précision</string>
<string name="renderer_resolution">Résolution (Mode Portable/Mode TV)</string>
<string name="renderer_vsync">Mode VSync</string>
@@ -850,6 +856,11 @@
<string name="renderer_vulkan">Vulkan</string>
<string name="renderer_none">Aucune</string>
<!-- Renderer Accuracy -->
<string name="renderer_accuracy_normal">Normal</string>
<string name="renderer_accuracy_high">Haut</string>
<string name="renderer_accuracy_extreme">Extrême</string>
<!-- DMA Accuracy -->
<string name="dma_accuracy_default">Défaut</string>
<string name="dma_accuracy_unsafe">Dangereux</string>


@@ -76,6 +76,10 @@
<string name="sample_shading_fraction_description">עוצמת מעבר ההצללה לדוגמה. ערכים גבוהים יותר משפרים את האיכות יותר אך גם מפחיתים את הביצועים במידה רבה יותר.</string>
<string name="veil_renderer">רנדרר</string>
<string name="frame_interpolation">סנכרון פריימים מתקדם</string>
<string name="frame_interpolation_description">מבטיח אספקה חלקה ועקבית של פריימים על ידי סנכרון התזמון ביניהם, מפחית קפיצות ואנימציה לא אחידה. אידיאלי למשחקים עם בעיות בתזמון פריימים או מיקרו-קפיצות במהלך המשחק.</string>
<string name="renderer_early_release_fences">שחרר גדרות מוקדם</string>
<string name="renderer_early_release_fences_description">עוזר לתקן 0 FPS במשחקים כמו DKCR:HD, Subnautica Below Zero ו-Ori 2, אך עלול לפגוע בטעינה או בביצועים במשחקי Unreal Engine.</string>
<string name="sync_memory_operations">סנכרון פעולות זיכרון</string>
<string name="sync_memory_operations_description">מבטיח עקביות נתונים בין פעולות חישוב וזיכרון. אפשרות זו אמורה לתקן בעיות במשחקים מסוימים, אך עלולה להפחית ביצועים במקרים מסוימים. נראה שהמשחקים עם Unreal Engine 4 הם המושפעים ביותר.</string>
<string name="buffer_reorder_disable">השבת סידור מחדש של חוצץ</string>
@@ -402,6 +406,8 @@
<string name="display">תצוגה</string>
<string name="processing">עיבוד לאחר</string>
<string name="frame_skipping">בעבודה: דילוג פריימים</string>
<string name="frame_skipping_description">החלף דילוג על פריימים כדי לשפר ביצועים על ידי הפחתת מספר הפריימים המוצגים. תכונה זו עדיין בפיתוח ותופעל בגרסאות עתידיות.</string>
<string name="renderer_accuracy">רמת דיוק</string>
<string name="renderer_resolution">רזולוציה (מעוגן/נייד)</string>
<string name="renderer_vsync">מצב VSync</string>
@@ -664,6 +670,9 @@
<string name="renderer_vulkan">Vulkan</string>
<string name="renderer_none">אין שום דבר</string>
<!-- Renderer Accuracy -->
<string name="renderer_accuracy_normal">רגיל</string>
<string name="renderer_accuracy_high">גבוה</string>
<!-- DMA Accuracy -->
<string name="dma_accuracy_default">ברירת מחדל</string>
<!-- ASTC Decoding Method -->


@@ -76,6 +76,10 @@
<string name="sample_shading_fraction_description">A mintavételezés árnyékolási lépés intenzitása. A magasabb értékek jobb minőséget eredményeznek, de nagyobb mértékben csökkentik a teljesítményt.</string>
<string name="veil_renderer">Megjelenítő</string>
<string name="frame_interpolation">Továbbfejlesztett Képkocka-időzítés</string>
<string name="frame_interpolation_description">Biztosítja a képkockák sima és egyenletes kézbesítését azok időzítésének szinkronizálásával, csökkentve a megakadásokat és egyenetlen animációkat. Ideális azokhoz a játékokhoz, amelyek képkocka-időzítési instabilitást vagy mikro-reccsenést tapasztalnak játék közben.</string>
<string name="renderer_early_release_fences">Korai kerítés-felszabadítás</string>
<string name="renderer_early_release_fences_description">Segít javítani a 0 FPS-t olyan játékokban, mint a DKCR:HD, Subnautica Below Zero és az Ori 2, de ronthatja az Unreal Engine játékok betöltését vagy teljesítményét.</string>
<string name="sync_memory_operations">Memória-műveletek szinkronizálása</string>
<string name="sync_memory_operations_description">Biztosítja az adatok konzisztenciáját a számítási és memória-műveletek között. Ez az opciónak javítania kell néhány játékban előforduló problémát, de bizonyos esetekben csökkentheti a teljesítményt. Az Unreal Engine 4-et használó játékok látszanak a legérintettebbek.</string>
<string name="buffer_reorder_disable">Puffer újrarendezés letiltása</string>
@@ -397,6 +401,8 @@
<string name="display">Kijelző</string>
<string name="processing">Utófeldolgozás</string>
<string name="frame_skipping">Folyamatban: Képkihagyás</string>
<string name="frame_skipping_description">Kapcsolja be a képkihagyást a teljesítmény javításához a renderelt képkockák számának csökkentésével. Ez a funkció még fejlesztés alatt áll, és a jövőbeli kiadásokban lesz elérhető.</string>
<string name="renderer_accuracy">Pontosság szintje</string>
<string name="renderer_resolution">Felbontás (Kézi/Dockolt)</string>
<string name="renderer_vsync">VSync mód</string>
@@ -759,6 +765,9 @@
<string name="renderer_vulkan">Vulkan</string>
<string name="renderer_none">Nincs</string>
<!-- Renderer Accuracy -->
<string name="renderer_accuracy_normal">Normál</string>
<string name="renderer_accuracy_high">Magas</string>
<!-- DMA Accuracy -->
<string name="dma_accuracy_default">Alapértelmezett</string>
<!-- ASTC Decoding Method -->


@@ -98,6 +98,10 @@
<string name="sample_shading_fraction_description">Intensitas proses pencahayaan sampel. Nilai lebih tinggi meningkatkan kualitas lebih baik tetapi juga mengurangi performa lebih besar.</string>
<string name="veil_renderer">Renderer</string>
<string name="frame_interpolation">Penyelarasan Frame Tingkat Lanjut</string>
<string name="frame_interpolation_description">Memastikan pengiriman frame yang halus dan konsisten dengan menyinkronkan waktu antar frame, mengurangi stuttering dan animasi tidak rata. Ideal untuk game yang mengalami ketidakstabilan waktu frame atau micro-stutter selama gameplay.</string>
<string name="renderer_early_release_fences">Lepas Pagar Lebih Awal</string>
<string name="renderer_early_release_fences_description">Membantu memperbaiki 0 FPS di game seperti DKCR:HD, Subnautica Below Zero dan Ori 2, tapi mungkin mengganggu loading atau performa di game Unreal Engine.</string>
<string name="sync_memory_operations">Sinkronisasi Operasi Memori</string>
<string name="sync_memory_operations_description">Memastikan konsistensi data antara operasi komputasi dan memori. Opsi ini seharusnya memperbaiki masalah di beberapa game, tetapi mungkin mengurangi performa dalam beberapa kasus. Game dengan Unreal Engine 4 tampaknya yang paling terpengaruh.</string>
<string name="buffer_reorder_disable">Nonaktifkan Penyusunan Ulang Buffer</string>
@@ -433,6 +437,8 @@
<string name="display">Tampilan</string>
<string name="processing">Pascaproses</string>
<string name="frame_skipping">WIP: Loncatan Frame</string>
<string name="frame_skipping_description">Aktifkan atau nonaktifkan frame skipping untuk meningkatkan performa dengan mengurangi jumlah frame yang dirender. Fitur ini masih dalam pengembangan dan akan diaktifkan di rilis mendatang.</string>
<string name="renderer_accuracy">Tingkatan Akurasi</string>
<string name="renderer_resolution">Resolusi (Handheld/Docked)</string>
<string name="renderer_vsync">Mode Sinkronisasi Vertikal</string>
@@ -809,6 +815,9 @@
<string name="renderer_vulkan">Vulkan</string>
<string name="renderer_none">Tak ada</string>
<!-- Renderer Accuracy -->
<string name="renderer_accuracy_normal">Normal</string>
<string name="renderer_accuracy_high">Tinggi</string>
<!-- DMA Accuracy -->
<string name="dma_accuracy_default">Bawaan</string>
<!-- ASTC Decoding Method -->


@@ -98,6 +98,10 @@
<string name="sample_shading_fraction_description">L\'intensità della passata di ombreggiatura campione. Valori più alti migliorano la qualità ma riducono maggiormente le prestazioni.</string>
<string name="veil_renderer">Renderer</string>
<string name="frame_interpolation">Sincronizzazione avanzata fotogrammi</string>
<string name="frame_interpolation_description">Garantisce una consegna fluida e costante dei fotogrammi sincronizzandone i tempi, riducendo scatti e animazioni irregolari. Ideale per giochi che presentano instabilità nei tempi dei fotogrammi o micro-scatti durante il gameplay.</string>
<string name="renderer_early_release_fences">Rilascia le barriere prima</string>
<string name="renderer_early_release_fences_description">Risolve problemi di 0 FPS in giochi come DKCR:HD, Subnautica Below Zero e Ori 2, ma potrebbe compromettere caricamento o prestazioni in giochi Unreal Engine.</string>
<string name="sync_memory_operations">Sincronizza operazioni di memoria</string>
<string name="sync_memory_operations_description">Garantisce la coerenza dei dati tra le operazioni di calcolo e memoria. Questa opzione dovrebbe risolvere problemi in alcuni giochi, ma potrebbe ridurre le prestazioni in alcuni casi. I giochi con Unreal Engine 4 sembrano essere i più colpiti.</string>
<string name="buffer_reorder_disable">Disabilita riordino buffer</string>
@@ -443,6 +447,8 @@
<string name="display">Schermo</string>
<string name="processing">Post-elaborazione</string>
<string name="frame_skipping">WIP: Salto fotogrammi</string>
<string name="frame_skipping_description">Attiva o disattiva il salto dei fotogrammi per migliorare le prestazioni riducendo il numero di fotogrammi renderizzati. Questa funzionalità è ancora in sviluppo e verrà abilitata nelle versioni future.</string>
<string name="renderer_accuracy">Livello di accuratezza</string>
<string name="renderer_resolution">Risoluzione (Portatile/Docked)</string>
<string name="renderer_vsync">Modalità VSync</string>
@@ -850,6 +856,11 @@
<string name="renderer_vulkan">Vulkan</string>
<string name="renderer_none">Nessuna</string>
<!-- Renderer Accuracy -->
<string name="renderer_accuracy_normal">Normale</string>
<string name="renderer_accuracy_high">Alta</string>
<string name="renderer_accuracy_extreme">Estrema</string>
<!-- DMA Accuracy -->
<string name="dma_accuracy_default">Predefinito</string>
<string name="dma_accuracy_unsafe">Non sicura</string>


@@ -76,6 +76,10 @@
<string name="sample_shading_fraction_description">サンプルシェーディング処理の強度。高い値ほど品質は向上しますが、パフォーマンスも大きく低下します。</string>
<string name="veil_renderer">レンダラー</string>
<string name="frame_interpolation">高度なフレーム同期</string>
<string name="frame_interpolation_description">フレーム間のタイミングを同期させることで、スムーズで一貫したフレーム配信を確保し、カクつきや不均一なアニメーションを軽減します。フレームタイミングの不安定さやマイクロスタッターが発生するゲームに最適です。</string>
<string name="renderer_early_release_fences">フェンスを早期に解放</string>
<string name="renderer_early_release_fences_description">DKCR:HD、Subnautica Below Zero、Ori 2などのゲームで0 FPSを修正しますが、Unreal Engineゲームの読み込みやパフォーマンスに影響する可能性があります。</string>
<string name="sync_memory_operations">メモリ操作の同期</string>
<string name="sync_memory_operations_description">計算処理とメモリ操作間のデータ一貫性を保証します。 このオプションは一部のゲームの問題を修正しますが、場合によってはパフォーマンスが低下する可能性があります。 Unreal Engine 4のゲームが最も影響を受けるようです。</string>
<string name="buffer_reorder_disable">バッファの再並べ替えを無効化</string>
@@ -397,6 +401,8 @@
<string name="display">ディスプレイ</string>
<string name="processing">後処理</string>
<string name="frame_skipping">WIP: フレームスキップ</string>
<string name="frame_skipping_description">フレームスキップを切り替えて、レンダリングされるフレーム数を減らしパフォーマンスを向上させます。この機能は開発中であり、今後のリリースで有効になります。</string>
<string name="renderer_accuracy">精度</string>
<string name="renderer_resolution">解像度(携帯モード/TVモード)</string>
<string name="renderer_vsync">垂直同期モード</string>
@@ -664,6 +670,9 @@
<string name="renderer_vulkan">Vulkan</string>
<string name="renderer_none">なし</string>
<!-- Renderer Accuracy -->
<string name="renderer_accuracy_normal">標準</string>
<string name="renderer_accuracy_high">高</string>
<!-- DMA Accuracy -->
<string name="dma_accuracy_default">デフォルト</string>
<!-- ASTC Decoding Method -->


@@ -76,6 +76,10 @@
<string name="sample_shading_fraction_description">샘플 쉐이딩 패스의 강도. 값이 높을수록 품질이 더 향상되지만 성능도 더 크게 저하됩니다.</string>
<string name="veil_renderer">렌더러</string>
<string name="frame_interpolation">향상된 프레임 페이싱</string>
<string name="frame_interpolation_description">프레임 간 타이밍을 동기화하여 부드럽고 일관된 프레임 전달을 보장하며, 끊김과 불균일한 애니메이션을 줄입니다. 프레임 타이밍 불안정이나 게임 플레이 중 미세 끊김이 발생하는 게임에 이상적입니다.</string>
<string name="renderer_early_release_fences">펜스 조기 해제</string>
<string name="renderer_early_release_fences_description">DKCR:HD, Subnautica Below Zero, Ori 2 등의 게임에서 0 FPS 현상을 해결하지만, Unreal Engine 게임의 로딩이나 성능에 문제를 일으킬 수 있습니다.</string>
<string name="sync_memory_operations">메모리 작업 동기화</string>
<string name="sync_memory_operations_description">컴퓨팅 및 메모리 작업 간 데이터 일관성을 보장합니다. 이 옵션은 일부 게임의 문제를 해결할 수 있지만 경우에 따라 성능이 저하될 수 있습니다. Unreal Engine 4 게임이 가장 큰 영향을 받는 것으로 보입니다.</string>
<string name="buffer_reorder_disable">버퍼 재정렬 비활성화</string>
@@ -397,6 +401,8 @@
<string name="display">디스플레이</string>
<string name="processing">후처리</string>
<string name="frame_skipping">작업 중: 프레임 스킵</string>
<string name="frame_skipping_description">렌더링되는 프레임 수를 줄여 성능을 향상시키기 위해 프레임 스킵을 전환합니다. 이 기능은 현재 개발 중이며 향후 출시 버전에서 활성화될 예정입니다.</string>
<string name="renderer_accuracy">정확도 수준</string>
<string name="renderer_resolution">해상도 (휴대 모드/독 모드)</string>
<string name="renderer_vsync">수직동기화 모드</string>
@@ -718,6 +724,9 @@
<string name="renderer_vulkan">Vulkan</string>
<string name="renderer_none">없음</string>
<!-- Renderer Accuracy -->
<string name="renderer_accuracy_normal">보통</string>
<string name="renderer_accuracy_high">높음</string>
<!-- DMA Accuracy -->
<string name="dma_accuracy_default">기본값</string>
<!-- ASTC Decoding Method -->


@@ -76,6 +76,10 @@
<string name="sample_shading_fraction_description">Intensiteten til prøveskyggepasseringen. Høyere verdier forbedrer kvaliteten mer, men reduserer også ytelsen i større grad.</string>
<string name="veil_renderer">Renderer</string>
<string name="frame_interpolation">Avansert bildevindu-synkronisering</string>
<string name="frame_interpolation_description">Sikrer jevn og konsekvent bildelevering ved å synkronisere tiden mellom bilder, noe som reduserer hakking og ujevn animasjon. Ideelt for spill som opplever ustabil bildetid eller mikro-hakk under spilling.</string>
<string name="renderer_early_release_fences">Frigjør gjerder tidlig</string>
<string name="renderer_early_release_fences_description">Løser 0 FPS i spill som DKCR:HD, Subnautica Below Zero og Ori 2, men kan forårsake problemer med lasting eller ytelse i Unreal Engine-spill.</string>
<string name="sync_memory_operations">Synkroniser minneoperasjoner</string>
<string name="sync_memory_operations_description">Sikrer datakonsistens mellom beregnings- og minneoperasjoner. Dette alternativet bør fikse problemer i noen spill, men kan redusere ytelsen i noen tilfeller. Spill med Unreal Engine 4 ser ut til å være de mest berørte.</string>
<string name="buffer_reorder_disable">Deaktiver bufferomorganisering</string>
@@ -378,6 +382,8 @@
<string name="display">Skjerm</string>
<string name="processing">Etterbehandling</string>
<string name="frame_skipping">WIP: Hoppe over bilder</string>
<string name="frame_skipping_description">Slå av/på frame skipping for å forbedre ytelsen ved å redusere antall renderte bilder. Denne funksjonen er fortsatt under utvikling og vil bli aktivert i fremtidige versjoner.</string>
<string name="renderer_accuracy">Nøyaktighetsnivå</string>
<string name="renderer_resolution">Oppløsning (håndholdt/dokket)</string>
<string name="renderer_vsync">VSync-modus</string>
@@ -636,6 +642,9 @@
<string name="renderer_vulkan">Vulkan</string>
<string name="renderer_none">Ingen</string>
<!-- Renderer Accuracy -->
<string name="renderer_accuracy_normal">Normal</string>
<string name="renderer_accuracy_high">Høy</string>
<!-- DMA Accuracy -->
<string name="dma_accuracy_default">Standard</string>
<!-- ASTC Decoding Method -->


@@ -98,6 +98,10 @@
<string name="sample_shading_fraction_description">Intensywność przebiegu cieniowania próbki. Wyższe wartości poprawiają jakość, ale także w większym stopniu zmniejszają wydajność.</string>
<string name="veil_renderer">Renderer</string>
<string name="frame_interpolation">Zaawansowana synchronizacja klatek</string>
<string name="frame_interpolation_description">Zapewnia płynne i spójne wyświetlanie klatek poprzez synchronizację ich czasu, redukując zacinanie i nierówną animację. Idealne dla gier z niestabilnym czasem klatek lub mikro-zacinaniem podczas rozgrywki.</string>
<string name="renderer_early_release_fences">Wcześniejsze zwalnianie zabezpieczeń</string>
<string name="renderer_early_release_fences_description">Pomaga naprawić 0 FPS w grach takich jak DKCR:HD, Subnautica Below Zero i Ori 2, ale może zaburzyć ładowanie lub wydajność w grach Unreal Engine.</string>
<string name="sync_memory_operations">Synchronizuj operacje pamięci</string>
<string name="sync_memory_operations_description">Zapewnia spójność danych między operacjami obliczeniowymi i pamięciowymi. Ta opcja powinna naprawiać problemy w niektórych grach, ale może zmniejszyć wydajność w niektórych przypadkach. Gry z Unreal Engine 4 wydają się być najbardziej dotknięte.</string>
<string name="buffer_reorder_disable">Wyłącz przestawianie bufora</string>
@@ -464,6 +468,8 @@
<string name="display">Wyświetlacz</string>
<string name="processing">Postprocessing</string>
<string name="frame_skipping">WIP: Pomijanie klatek</string>
<string name="frame_skipping_description">Włącz lub wyłącz pomijanie klatek, aby poprawić wydajność poprzez zmniejszenie liczby renderowanych klatek. Ta funkcja jest wciąż w fazie rozwoju i zostanie włączona w przyszłych wersjach.</string>
<string name="renderer_accuracy">Poziom precyzji emulacji</string>
<string name="renderer_resolution">Rozdzielczość (Handheld/Zadokowany)</string>
<string name="renderer_vsync">Synchronizacja pionowa VSync</string>
@@ -874,6 +880,11 @@
<string name="renderer_vulkan">Vulkan</string>
<string name="renderer_none">Żaden</string>
<!-- Renderer Accuracy -->
<string name="renderer_accuracy_normal">Normalny</string>
<string name="renderer_accuracy_high">Wysoki</string>
<string name="renderer_accuracy_extreme">Ekstremalny</string>
<!-- DMA Accuracy -->
<string name="dma_accuracy_default">Domyślne</string>
<string name="dma_accuracy_unsafe">Niezalecane</string>


@@ -98,6 +98,10 @@
<string name="sample_shading_fraction_description">Fração de Sombreamento de Amostra: Define a intensidade do sample shading. Quanto maior, melhor a qualidade, mas maior o impacto no desempenho.</string>
<string name="veil_renderer">Renderizador</string>
<string name="frame_interpolation">Sincronização Melhorada de Quadros</string>
<string name="frame_interpolation_description">Sincronização Melhorada de Quadros: Sincroniza o tempo entre os quadros para uma entrega mais uniforme, reduzindo travamentos e animações irregulares. Útil em jogos que sofrem com microtravamentos ou instabilidade na taxa de frames.</string>
<string name="renderer_early_release_fences">Liberar Cercas Antecipadamente</string>
<string name="renderer_early_release_fences_description">Liberar Cercas Antecipadamente: Ajuda a corrigir 0 FPS em jogos como DKCR:HD, Subnautica Below Zero e Ori 2, mas pode prejudicar o carregamento ou o desempenho em jogos feitos com Unreal Engine.</string>
<string name="sync_memory_operations">Sincronizar Operações de Memória</string>
<string name="sync_memory_operations_description">Garante a consistência de dados entre operações de computação e memória. Esta opção pode corrigir problemas em alguns jogos, mas também pode reduzir o desempenho, sendo os jogos da Unreal Engine 4 os mais afetados.</string>
<string name="buffer_reorder_disable">Desativar Reordenação de Buffers</string>
@@ -442,6 +446,8 @@
<string name="display">Tela</string>
<string name="processing">Pós-Processamento</string>
<string name="frame_skipping">WIP: Pular quadros</string>
<string name="frame_skipping_description">Ative ou desative o pulo de quadros para melhorar o desempenho reduzindo o número de quadros renderizados. Este recurso ainda está em desenvolvimento e será habilitado em versões futuras.</string>
<string name="renderer_accuracy">Nível de precisão</string>
<string name="renderer_resolution">Resolução (Portátil/Modo TV)</string>
<string name="renderer_vsync">Modo de VSync</string>
@@ -838,6 +844,11 @@ uma tentativa de mapeamento automático</string>
<string name="renderer_vulkan">Vulkan</string>
<string name="renderer_none">Nenhum</string>
<!-- Renderer Accuracy -->
<string name="renderer_accuracy_normal">Normal</string>
<string name="renderer_accuracy_high">Alta</string>
<string name="renderer_accuracy_extreme">Extrema</string>
<!-- DMA Accuracy -->
<string name="dma_accuracy_default">Padrão</string>
<string name="dma_accuracy_unsafe">Insegura</string>


@@ -76,6 +76,10 @@
<string name="sample_shading_fraction_description">A intensidade da passagem de sombreamento de amostra. Valores mais elevados melhoram a qualidade, mas também reduzem o desempenho numa maior medida.</string>
<string name="veil_renderer">Renderizador</string>
<string name="frame_interpolation">Sincronização avançada de frames</string>
<string name="frame_interpolation_description">Garante uma entrega suave e consistente de frames sincronizando o seu tempo, reduzindo engasgadelas e animações irregulares. Ideal para jogos que experienciam instabilidade no tempo de frames ou micro-engasgadelas durante o jogo.</string>
<string name="renderer_early_release_fences">Libertar barreiras antecipadamente</string>
<string name="renderer_early_release_fences_description">Ajuda a corrigir 0 FPS em jogos como DKCR:HD, Subnautica Below Zero e Ori 2, mas pode afetar carregamento ou desempenho em jogos Unreal Engine.</string>
<string name="sync_memory_operations">Sincronizar Operações de Memória</string>
<string name="sync_memory_operations_description">Garante a consistência dos dados entre operações de computação e memória. Esta opção deve corrigir problemas em alguns jogos, mas pode reduzir o desempenho nalguns casos. Os jogos com Unreal Engine 4 parecem ser os mais afectados.</string>
<string name="buffer_reorder_disable">Desativar reordenação de buffer</string>
@@ -401,6 +405,8 @@
<string name="display">Ecrã</string>
<string name="processing">Pós-processamento</string>
<string name="frame_skipping">WIP: Saltar frames</string>
<string name="frame_skipping_description">Ative ou desative o salto de frames para melhorar o desempenho reduzindo o número de frames renderizados. Esta funcionalidade ainda está em desenvolvimento e será ativada em versões futuras.</string>
<string name="renderer_accuracy">Nível de precisão</string>
<string name="renderer_resolution">Resolução (Portátil/Ancorado)</string>
<string name="renderer_vsync">Modo VSync</string>
@@ -771,6 +777,9 @@ uma tentativa de mapeamento automático</string>
<string name="renderer_vulkan">Vulkan</string>
<string name="renderer_none">Nenhum</string>
<!-- Renderer Accuracy -->
<string name="renderer_accuracy_normal">Normal</string>
<string name="renderer_accuracy_high">Alto</string>
<!-- DMA Accuracy -->
<string name="dma_accuracy_default">Predefinido</string>
<!-- ASTC Decoding Method -->
View File
@@ -98,6 +98,10 @@
<string name="sample_shading_fraction_description">Интенсивность прохода сэмплового затенения. Более высокие значения улучшают качество, но и сильнее снижают производительность.</string>
<string name="veil_renderer">Рендеринг</string>
<string name="frame_interpolation">Улучшенная синхронизация кадров</string>
<string name="frame_interpolation_description">Обеспечивает плавную и стабильную подачу кадров за счет синхронизации их времени, уменьшая подтормаживания и неравномерную анимацию. Идеально для игр с нестабильным временем кадров или микро-подтормаживаниями во время игры.</string>
<string name="renderer_early_release_fences">Ранний релиз ограждений</string>
<string name="renderer_early_release_fences_description">Помогает исправить 0 FPS в играх типа DKCR:HD, Subnautica Below Zero и Ori 2, но может нарушить загрузку или производительность в играх на Unreal Engine.</string>
<string name="sync_memory_operations">Синхронизация операций с памятью</string>
<string name="sync_memory_operations_description">Обеспечивает согласованность данных между вычислительными операциями и операциями с памятью. Эта опция должна исправлять проблемы в некоторых играх, но может снижать производительность в некоторых случаях. Наиболее сильно это затрагивает игры на Unreal Engine 4.</string>
<string name="buffer_reorder_disable">Отключить переупорядочивание буфера</string>
@@ -462,6 +466,8 @@
<string name="display">Дисплей</string>
<string name="processing">Постобработка</string>
<string name="frame_skipping">В разработке: Пропуск кадров</string>
<string name="frame_skipping_description">Включите или отключите пропуск кадров для повышения производительности за счет уменьшения количества отображаемых кадров. Эта функция находится в разработке и будет включена в будущих версиях.</string>
<string name="renderer_accuracy">Уровень точности</string>
<string name="renderer_resolution">Разрешение (портативное/в док-станции)</string>
<string name="renderer_vsync">Режим верт. синхронизации</string>
@@ -872,6 +878,11 @@
<string name="renderer_vulkan">Vulkan</string>
<string name="renderer_none">Никакой</string>
<!-- Renderer Accuracy -->
<string name="renderer_accuracy_normal">Нормальная</string>
<string name="renderer_accuracy_high">Высокая</string>
<string name="renderer_accuracy_extreme">Экстрим</string>
<!-- DMA Accuracy -->
<string name="dma_accuracy_default">По умолчанию</string>
<string name="dma_accuracy_unsafe">Небезопасно</string>
View File
@@ -75,6 +75,10 @@
<string name="sample_shading_fraction_description">Интензитет проласка сенчења узорка. Веће вредности побољшавају квалитет више, али такође више смањују перформансе.</string>
<string name="veil_renderer">Рендерер</string>
<string name="frame_interpolation">Побољшани оквирни пејсинг</string>
<string name="frame_interpolation_description">Осигурава глатку и доследан испоруку оквира синхронизацијом времена између оквира, смањење муцања и неуједначене анимације. Идеално за игре које доживљавају временски оквир нестабилност или микро-штитнике током играња.</string>
<string name="renderer_early_release_fences">Ranije oslobađanje ograda</string>
<string name="renderer_early_release_fences_description">Pomaže u popravci 0 FPS u igrama kao što su DKCR:HD, Subnautica Below Zero i Ori 2, ali može oštetiti učitavanje ili performanse u Unreal Engine igrama.</string>
<string name="sync_memory_operations">Синхронизација меморијских операција</string>
<string name="sync_memory_operations_description">Осигурава конзистентност података између рачунских и меморијских операција. Ова опција би требало да поправи проблеме у неким играма, али може смањити перформансе у неким случајевима. Чини се да су игре са Unreal Engine 4 највише погођене.</string>
<string name="buffer_reorder_disable">Онемогући преуређивање бафера</string>
@@ -401,6 +405,8 @@
<string name="display">Приказ</string>
<string name="processing">Постпроцесирање</string>
<string name="frame_skipping">ВИП: Фрамескип</string>
<string name="frame_skipping_description">Пребацивање оквира прескакање да бисте побољшали перформансе смањењем броја пружених оквира. Ова функција се и даље ради и биће омогућена у будућим издањима.</string>
<string name="renderer_accuracy">Ниво тачности</string>
<string name="renderer_resolution">Резолуција (ручно / прикључено)</string>
<string name="renderer_vsync">Всинц мод</string>
@@ -769,6 +775,9 @@
<string name="renderer_vulkan">Вулкан</string>
<string name="renderer_none">Ниједан</string>
<!-- Renderer Accuracy -->
<string name="renderer_accuracy_normal">Нормалан</string>
<string name="renderer_accuracy_high">Високо</string>
<!-- DMA Accuracy -->
<string name="dma_accuracy_default">Подразумевано</string>
<!-- ASTC Decoding Method -->
View File
@@ -98,6 +98,10 @@
<string name="sample_shading_fraction_description">Інтенсивність проходу затінення зразка. Вищі значення покращують якість, але й сильніше знижують продуктивність.</string>
<string name="veil_renderer">Візуалізатор</string>
<string name="frame_interpolation">Покращена синхронізація кадрів</string>
<string name="frame_interpolation_description">Забезпечує плавну та стабільну подачу кадрів шляхом синхронізації їх часу, зменшуючи підвисання та нерівномірну анімацію. Ідеально для ігор з нестабільним часом кадрів або мікро-підвисаннями під час гри.</string>
<string name="renderer_early_release_fences">Release fences early</string>
<string name="renderer_early_release_fences_description">Це налаштування може бути необхідним для виправлення помилок 0FPS у деяких іграх (зокрема DKCR:HD, Subnautica та Ori 2). Водночас інші ігри, особливо створені на рушії Unreal Engine, можуть працювати некоректно або взагалі не запускатися.</string>
<string name="sync_memory_operations">Синхронізація операцій з пам\'яттю</string>
<string name="sync_memory_operations_description">Забезпечує узгодженість даних між обчислювальними операціями та операціями з пам\'яттю. Ця опція має виправляти проблеми в деяких іграх, але може знижувати продуктивність у деяких випадках. Ігри на Unreal Engine 4, здається, найбільш постраждалі.</string>
<string name="buffer_reorder_disable">Вимкнути переупорядкування буфера</string>
@@ -464,6 +468,8 @@
<string name="display">Дисплей</string>
<string name="processing">Постобробка</string>
<string name="frame_skipping">В розробці: Пропуск кадрів</string>
<string name="frame_skipping_description">Увімкніть або вимкніть пропуск кадрів для покращення продуктивності за рахунок зменшення кількості візуалізованих кадрів. Ця функція ще розробляється та буде доступна у майбутніх версіях.</string>
<string name="renderer_accuracy">Рівень точності</string>
<string name="renderer_resolution">Роздільна здатність (Портативний/Док)</string>
<string name="renderer_vsync">Режим верт. синхронізації</string>
@@ -874,6 +880,11 @@
<string name="renderer_vulkan">Vulkan</string>
<string name="renderer_none">Вимкнено</string>
<!-- Renderer Accuracy -->
<string name="renderer_accuracy_normal">Нормальна</string>
<string name="renderer_accuracy_high">Висока</string>
<string name="renderer_accuracy_extreme">Екстремально</string>
<!-- DMA Accuracy -->
<string name="dma_accuracy_default">Типово</string>
<string name="dma_accuracy_unsafe">Небезпечно</string>
View File
@@ -76,6 +76,10 @@
<string name="sample_shading_fraction_description">Cường độ của bước tô bóng mẫu. Giá trị cao hơn cải thiện chất lượng tốt hơn nhưng cũng giảm hiệu suất nhiều hơn.</string>
<string name="veil_renderer">Trình kết xuất</string>
<string name="frame_interpolation">Đồng bộ khung hình nâng cao</string>
<string name="frame_interpolation_description">Đảm bảo cung cấp khung hình mượt mà và ổn định bằng cách đồng bộ hóa thời gian giữa các khung hình, giảm giật lag và hoạt ảnh không đồng đều. Lý tưởng cho các trò chơi gặp vấn đề về thời gian khung hình không ổn định hoặc giật lag nhẹ trong khi chơi.</string>
<string name="renderer_early_release_fences">Giải phóng rào chắn sớm</string>
<string name="renderer_early_release_fences_description">Giúp sửa lỗi 0 FPS trong các trò chơi như DKCR:HD, Subnautica Below Zero và Ori 2, nhưng có thể ảnh hưởng đến tải hoặc hiệu suất trong trò chơi Unreal Engine.</string>
<string name="sync_memory_operations">Đồng bộ hoá thao tác bộ nhớ</string>
<string name="sync_memory_operations_description">Đảm bảo tính nhất quán dữ liệu giữa các thao tác tính toán và bộ nhớ. Tùy chọn này nên khắc phục sự cố trong một số trò chơi, nhưng có thể làm giảm hiệu suất trong một số trường hợp. Các trò chơi với Unreal Engine 4 có vẻ bị ảnh hưởng nhiều nhất.</string>
<string name="buffer_reorder_disable">Tắt sắp xếp lại bộ đệm</string>
@@ -376,6 +380,8 @@
<string name="display">Hiển thị</string>
<string name="processing">Hậu xử lý</string>
<string name="frame_skipping">WIP: Bỏ qua khung hình</string>
<string name="frame_skipping_description">Bật hoặc tắt bỏ qua khung hình để cải thiện hiệu suất bằng cách giảm số lượng khung hình được kết xuất. Tính năng này đang được phát triển và sẽ được kích hoạt trong các bản phát hành tương lai.</string>
<string name="renderer_accuracy">Mức độ chính xác</string>
<string name="renderer_resolution">Độ phân giải (Handheld/Docked)</string>
<string name="renderer_vsync">Chế độ VSync</string>
@@ -636,6 +642,9 @@
<string name="renderer_vulkan">Vulkan</string>
<string name="renderer_none">Trống</string>
<!-- Renderer Accuracy -->
<string name="renderer_accuracy_normal">Trung bình</string>
<string name="renderer_accuracy_high">Cao</string>
<!-- DMA Accuracy -->
<string name="dma_accuracy_default">Mặc định</string>
<!-- ASTC Decoding Method -->
View File
@@ -98,6 +98,10 @@
<string name="sample_shading_fraction_description">采样着色处理的强度。值越高,质量改善越多,但性能降低也越明显。</string>
<string name="veil_renderer">渲染器</string>
<string name="frame_interpolation">增强帧同步</string>
<string name="frame_interpolation_description">通过同步帧间时间确保流畅一致的帧交付,减少卡顿和不均匀动画。适合存在帧时间不稳定或游戏过程中出现微卡顿的游戏。</string>
<string name="renderer_early_release_fences">提前释放围栏</string>
<string name="renderer_early_release_fences_description">可修复《大金刚国度:热带寒流》《深海迷航:零度之下》和《奥日2》等游戏中的0 FPS问题但可能影响Unreal Engine游戏的加载或性能。</string>
<string name="sync_memory_operations">同步内存操作</string>
<string name="sync_memory_operations_description">确保计算和内存操作之间的数据一致性。 此选项应能修复某些游戏中的问题,但在某些情况下可能会降低性能。 使用Unreal Engine 4的游戏似乎受影响最大。</string>
<string name="buffer_reorder_disable">禁用缓冲重排序</string>
@@ -439,6 +443,8 @@
<string name="display">显示</string>
<string name="processing">后处理</string>
<string name="frame_skipping">开发中:跳帧</string>
<string name="frame_skipping_description">启用或禁用跳帧以减少渲染帧数,提高性能。此功能仍在开发中,将在未来版本中启用。</string>
<string name="renderer_accuracy">精度等级</string>
<string name="renderer_resolution">分辨率 (掌机模式/主机模式)</string>
<string name="renderer_vsync">垂直同步模式</string>
@@ -846,6 +852,11 @@
<string name="renderer_vulkan">Vulkan</string>
<string name="renderer_none">无</string>
<!-- Renderer Accuracy -->
<string name="renderer_accuracy_normal">正常</string>
<string name="renderer_accuracy_high">高</string>
<string name="renderer_accuracy_extreme">极致</string>
<!-- DMA Accuracy -->
<string name="dma_accuracy_default">默认</string>
<string name="dma_accuracy_unsafe">不安全</string>
View File
@@ -98,6 +98,10 @@
<string name="sample_shading_fraction_description">採樣著色處理的強度。數值越高,品質改善越多,但效能降低也越明顯。</string>
<string name="veil_renderer">渲染器</string>
<string name="frame_interpolation">增強幀同步</string>
<string name="frame_interpolation_description">通過同步幀間時間確保幀傳輸流暢一致,減少卡頓和不均勻動畫。適合存在幀時間不穩定或遊戲過程中出現些微卡頓的遊戲。</string>
<string name="renderer_early_release_fences">提前釋放圍欄</string>
<string name="renderer_early_release_fences_description">可修復《咚奇剛歸來HD》、《深海迷航冰點之下》和《聖靈之光2》等遊戲中的0 FPS問題但可能影響Unreal Engine遊戲的載入或效能。</string>
<string name="sync_memory_operations">同步記憶體操作</string>
<string name="sync_memory_operations_description">確保計算和記憶體操作之間的資料一致性。 此選項應能修復某些遊戲中的問題,但在某些情況下可能會降低效能。 使用Unreal Engine 4的遊戲似乎受影響最大。</string>
<string name="buffer_reorder_disable">停用緩衝區重新排序</string>
@@ -439,6 +443,8 @@
<string name="display">顯示</string>
<string name="processing">後處理</string>
<string name="frame_skipping">開發中:跳幀</string>
<string name="frame_skipping_description">啟用或停用跳幀以減少渲染幀數,提高效能。此功能仍在開發中,將在未來版本中啟用。</string>
<string name="renderer_accuracy">準確度層級</string>
<string name="renderer_resolution">解析度 (手提/底座)</string>
<string name="renderer_vsync">垂直同步</string>
@@ -846,6 +852,11 @@
<string name="renderer_vulkan">Vulkan</string>
<string name="renderer_none">無</string>
<!-- Renderer Accuracy -->
<string name="renderer_accuracy_normal">標準</string>
<string name="renderer_accuracy_high">高</string>
<string name="renderer_accuracy_extreme">極高</string>
<!-- DMA Accuracy -->
<string name="dma_accuracy_default">預設</string>
<string name="dma_accuracy_unsafe">不安全</string>
View File
@@ -98,9 +98,9 @@
</integer-array>
<string-array name="rendererAccuracyNames">
<item>@string/renderer_accuracy_low</item>
<item>@string/renderer_accuracy_medium</item>
<item>@string/renderer_accuracy_normal</item>
<item>@string/renderer_accuracy_high</item>
<item>@string/renderer_accuracy_extreme</item>
</string-array>
<integer-array name="rendererAccuracyValues">
View File
@@ -105,6 +105,10 @@
<string name="sample_shading_fraction_description">The intensity of the sample shading pass. Higher values improve quality more but also reduce performance to a greater extent.</string>
<string name="veil_renderer">Renderer</string>
<string name="frame_interpolation">Enhanced Frame Pacing</string>
<string name="frame_interpolation_description">Ensures smooth and consistent frame delivery by synchronizing the timing between frames, reducing stuttering and uneven animation. Ideal for games that experience frame timing instability or micro-stutters during gameplay.</string>
<string name="renderer_early_release_fences">Release Fences Early</string>
<string name="renderer_early_release_fences_description">Helps fix 0 FPS in games like DKCR:HD, Subnautica Below Zero and Ori 2, but may break loading or performance in Unreal Engine games.</string>
<string name="sync_memory_operations">Sync Memory Operations</string>
<string name="sync_memory_operations_description">Ensures data consistency between compute and memory operations. This option should fix issues in some games, but may also reduce performance in some cases. Unreal Engine 4 games often see the most significant changes thereof.</string>
<string name="buffer_reorder_disable">Disable Buffer Reorder</string>
@@ -485,6 +489,8 @@
<string name="display">Display</string>
<string name="processing">Post-Processing</string>
<string name="frame_skipping">WIP: Frameskip</string>
<string name="frame_skipping_description">Toggle frame skipping to improve performance by reducing the number of rendered frames. This feature is still being worked on and will be enabled in future releases.</string>
<string name="renderer_accuracy">Accuracy level</string>
<string name="renderer_resolution">Resolution (Handheld/Docked)</string>
<string name="renderer_vsync">VSync mode</string>
@@ -922,9 +928,9 @@
<string name="renderer_none">None</string>
<!-- Renderer Accuracy -->
<string name="renderer_accuracy_low">Fast</string>
<string name="renderer_accuracy_medium">Balanced</string>
<string name="renderer_accuracy_high">Accurate</string>
<string name="renderer_accuracy_normal">Normal</string>
<string name="renderer_accuracy_high">High</string>
<string name="renderer_accuracy_extreme">Extreme</string>
<!-- DMA Accuracy -->
<string name="dma_accuracy_default">Default</string>
View File
@@ -181,6 +181,12 @@ if(ANDROID)
android/applets/software_keyboard.h)
endif()
if(LINUX AND NOT APPLE)
target_sources(common PRIVATE linux/gamemode.cpp linux/gamemode.h)
target_link_libraries(common PRIVATE gamemode::headers)
endif()
if(ARCHITECTURE_x86_64)
target_sources(
common
View File
@@ -0,0 +1,40 @@
// SPDX-FileCopyrightText: Copyright 2023 yuzu Emulator Project
// SPDX-License-Identifier: GPL-2.0-or-later
#include <gamemode_client.h>
#include "common/linux/gamemode.h"
#include "common/logging/log.h"
#include "common/settings.h"
namespace Common::Linux {
void StartGamemode() {
if (Settings::values.enable_gamemode) {
if (gamemode_request_start() < 0) {
LOG_WARNING(Frontend, "Failed to start gamemode: {}", gamemode_error_string());
} else {
LOG_INFO(Frontend, "Started gamemode");
}
}
}
void StopGamemode() {
if (Settings::values.enable_gamemode) {
if (gamemode_request_end() < 0) {
LOG_WARNING(Frontend, "Failed to stop gamemode: {}", gamemode_error_string());
} else {
LOG_INFO(Frontend, "Stopped gamemode");
}
}
}
void SetGamemodeState(bool state) {
if (state) {
StartGamemode();
} else {
StopGamemode();
}
}
} // namespace Common::Linux
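The new gamemode.cpp wraps the Feral Interactive client calls behind the user setting. A minimal compilable sketch of the same dispatch logic, with the daemon calls stubbed out (`gamemode_stub_*` and the reduced `Settings` struct are stand-ins, not the real `gamemode_client.h` API):

```cpp
#include <cassert>

// Stand-ins for gamemode_client.h; the real functions talk to the
// gamemoded daemon and return a negative value on failure.
static bool g_gamemode_active = false;
int gamemode_stub_request_start() { g_gamemode_active = true; return 0; }
int gamemode_stub_request_end()   { g_gamemode_active = false; return 0; }

namespace Settings { struct { bool enable_gamemode = true; } values; }

// Mirrors Common::Linux::SetGamemodeState: the daemon is only touched
// when the enable_gamemode setting is on.
void SetGamemodeState(bool state) {
    if (!Settings::values.enable_gamemode) {
        return;
    }
    if (state) {
        gamemode_stub_request_start();
    } else {
        gamemode_stub_request_end();
    }
}
```

The real implementation also logs success and failure via `LOG_INFO`/`LOG_WARNING`, which is elided here.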
View File
@@ -0,0 +1,24 @@
// SPDX-FileCopyrightText: Copyright 2023 yuzu Emulator Project
// SPDX-License-Identifier: GPL-2.0-or-later
#pragma once
namespace Common::Linux {
/**
* Start the (Feral Interactive) Linux gamemode if it is installed and it is activated
*/
void StartGamemode();
/**
* Stop the (Feral Interactive) Linux gamemode if it is installed and it is activated
*/
void StopGamemode();
/**
* Start or stop the (Feral Interactive) Linux gamemode if it is installed and it is activated
* @param state The new state the gamemode should have
*/
void SetGamemodeState(bool state);
} // namespace Common::Linux
View File
@@ -149,16 +149,13 @@ void UpdateGPUAccuracy() {
values.current_gpu_accuracy = values.gpu_accuracy.GetValue();
}
bool IsGPULevelLow() {
return values.current_gpu_accuracy == GpuAccuracy::Low;
}
bool IsGPULevelMedium() {
return values.current_gpu_accuracy == GpuAccuracy::Medium;
bool IsGPULevelExtreme() {
return values.current_gpu_accuracy == GpuAccuracy::Extreme;
}
bool IsGPULevelHigh() {
return values.current_gpu_accuracy == GpuAccuracy::High;
return values.current_gpu_accuracy == GpuAccuracy::Extreme ||
values.current_gpu_accuracy == GpuAccuracy::High;
}
bool IsDMALevelDefault() {
@@ -274,6 +271,8 @@ const char* TranslateCategory(Category category) {
return "Services";
case Category::Paths:
return "Paths";
case Category::Linux:
return "Linux";
case Category::MaxEnum:
break;
}
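After the rename, High is no longer the top tier: `IsGPULevelHigh()` now answers "High or above", and the exact top tier gets its own `IsGPULevelExtreme()` check. A self-contained sketch of the new predicates (the surrounding `Values` struct is reduced to a single global here):

```cpp
#include <cassert>

enum class GpuAccuracy { Normal = 0, High = 1, Extreme = 2 };

GpuAccuracy current_gpu_accuracy = GpuAccuracy::High;

bool IsGPULevelExtreme() {
    return current_gpu_accuracy == GpuAccuracy::Extreme;
}

// "High" is now a floor rather than an exact match: Extreme implies High,
// so call sites gating high-accuracy paths keep working at Extreme.
bool IsGPULevelHigh() {
    return current_gpu_accuracy == GpuAccuracy::Extreme ||
           current_gpu_accuracy == GpuAccuracy::High;
}
```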
View File
@@ -333,7 +333,12 @@ struct Values {
"shader_backend", Category::Renderer, Specialization::RuntimeList};
SwitchableSetting<int> vulkan_device{linkage, 0, "vulkan_device", Category::Renderer,
Specialization::RuntimeList};
#ifdef __ANDROID__
SwitchableSetting<bool> frame_interpolation{linkage, true, "frame_interpolation", Category::Renderer,
Specialization::RuntimeList};
SwitchableSetting<bool> frame_skipping{linkage, false, "frame_skipping", Category::Renderer,
Specialization::RuntimeList};
#endif
SwitchableSetting<bool> use_disk_shader_cache{linkage, true, "use_disk_shader_cache",
Category::Renderer};
SwitchableSetting<SpirvOptimizeMode, true> optimize_spirv_output{linkage,
@@ -414,9 +419,9 @@ struct Values {
SwitchableSetting<GpuAccuracy, true> gpu_accuracy{linkage,
#ifdef ANDROID
GpuAccuracy::Low,
GpuAccuracy::Normal,
#else
GpuAccuracy::Medium,
GpuAccuracy::High,
#endif
"gpu_accuracy",
Category::RendererAdvanced,
@@ -424,7 +429,7 @@ struct Values {
true,
true};
GpuAccuracy current_gpu_accuracy{GpuAccuracy::Medium};
GpuAccuracy current_gpu_accuracy{GpuAccuracy::High};
SwitchableSetting<DmaAccuracy, true> dma_accuracy{linkage,
DmaAccuracy::Default,
@@ -457,6 +462,15 @@ struct Values {
Specialization::Default,
true,
true};
#ifdef ANDROID
SwitchableSetting<bool> early_release_fences{linkage,
false,
"early_release_fences",
Category::RendererAdvanced,
Specialization::Default,
true,
true};
#endif
SwitchableSetting<bool> sync_memory_operations{linkage,
false,
"sync_memory_operations",
@@ -607,6 +621,13 @@ struct Values {
true,
true};
// Linux
SwitchableSetting<bool> enable_gamemode{linkage, true, "enable_gamemode", Category::Linux};
#ifdef __unix__
SwitchableSetting<bool> gui_force_x11{linkage, false, "gui_force_x11", Category::Linux};
Setting<bool> gui_hide_backend_warning{linkage, false, "gui_hide_backend_warning", Category::Linux};
#endif
// Controls
InputSetting<std::array<PlayerInput, 10>> players;
@@ -762,8 +783,7 @@ extern Values values;
bool getDebugKnobAt(u8 i);
void UpdateGPUAccuracy();
bool IsGPULevelLow();
bool IsGPULevelMedium();
bool IsGPULevelExtreme();
bool IsGPULevelHigh();
bool IsDMALevelDefault();
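With this change `gpu_accuracy` defaults to Normal on Android and High elsewhere via the `#ifdef ANDROID` split. The pattern can be sketched as a compile-time default selection (`SwitchableSetting` is reduced to a bare value holder here, not the real template):

```cpp
#include <cassert>

enum class GpuAccuracy { Normal, High, Extreme };

// Reduced stand-in for Settings::SwitchableSetting: just a default value.
template <typename T>
struct SwitchableSetting {
    explicit SwitchableSetting(T default_value) : value{default_value} {}
    T GetValue() const { return value; }
    T value;
};

constexpr GpuAccuracy DefaultGpuAccuracy() {
#ifdef ANDROID
    return GpuAccuracy::Normal;  // weaker mobile GPUs start one tier lower
#else
    return GpuAccuracy::High;    // desktop builds start at the new High
#endif
}

SwitchableSetting<GpuAccuracy> gpu_accuracy{DefaultGpuAccuracy()};
```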
View File
@@ -1,6 +1,3 @@
// SPDX-FileCopyrightText: Copyright 2025 Eden Emulator Project
// SPDX-License-Identifier: GPL-3.0-or-later
// SPDX-FileCopyrightText: Copyright 2023 yuzu Emulator Project
// SPDX-License-Identifier: GPL-2.0-or-later
@@ -47,6 +44,7 @@ enum class Category : u32 {
Multiplayer,
Services,
Paths,
Linux,
LibraryApplet,
MaxEnum,
};
View File
@@ -133,7 +133,7 @@ ENUM(VSyncMode, Immediate, Mailbox, Fifo, FifoRelaxed);
ENUM(VramUsageMode, Conservative, Aggressive);
ENUM(RendererBackend, OpenGL, Vulkan, Null);
ENUM(ShaderBackend, Glsl, Glasm, SpirV);
ENUM(GpuAccuracy, Low, Medium, High);
ENUM(GpuAccuracy, Normal, High, Extreme);
ENUM(DmaAccuracy, Default, Unsafe, Safe);
ENUM(CpuBackend, Dynarmic, Nce);
ENUM(CpuAccuracy, Auto, Accurate, Unsafe, Paranoid, Debugging);
View File
@@ -77,9 +77,9 @@ void EmitX64::EmitPushRSB(IR::Block&, IR::Inst* inst) {
ASSERT(inst->GetArg(0).IsImmediate());
u64 imm64 = inst->GetArg(0).GetU64();
Xbyak::Reg64 code_ptr_reg = reg_alloc.ScratchGpr(code, {HostLoc::RCX});
Xbyak::Reg64 loc_desc_reg = reg_alloc.ScratchGpr(code);
Xbyak::Reg32 index_reg = reg_alloc.ScratchGpr(code).cvt32();
Xbyak::Reg64 code_ptr_reg = reg_alloc.ScratchGpr({HostLoc::RCX});
Xbyak::Reg64 loc_desc_reg = reg_alloc.ScratchGpr();
Xbyak::Reg32 index_reg = reg_alloc.ScratchGpr().cvt32();
u64 code_ptr = unique_hash_to_code_ptr.find(imm64) != unique_hash_to_code_ptr.end()
? u64(unique_hash_to_code_ptr[imm64])
: u64(code->GetReturnFromRunCodeAddress());
View File
@@ -175,6 +175,7 @@ if ("x86_64" IN_LIST ARCHITECTURE)
backend/x64/exclusive_monitor.cpp
backend/x64/exclusive_monitor_friend.h
backend/x64/host_feature.h
backend/x64/hostloc.cpp
backend/x64/hostloc.h
backend/x64/jitstate_info.h
backend/x64/oparg.h
View File
@@ -1,6 +1,3 @@
// SPDX-FileCopyrightText: Copyright 2025 Eden Emulator Project
// SPDX-License-Identifier: GPL-3.0-or-later
/* This file is part of the dynarmic project.
* Copyright (c) 2022 MerryMage
* SPDX-License-Identifier: 0BSD
@@ -63,7 +60,7 @@ void EmitIR<IR::Opcode::Pack2x32To1x64>(oaknut::CodeGenerator& code, EmitContext
template<>
void EmitIR<IR::Opcode::Pack2x64To1x128>(oaknut::CodeGenerator& code, EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
bool const args_in_gpr[] = { args[0].IsInGpr(ctx.reg_alloc), args[1].IsInGpr(ctx.reg_alloc) };
bool const args_in_gpr[] = { args[0].IsInGpr(), args[1].IsInGpr() };
if (args_in_gpr[0] && args_in_gpr[1]) {
auto Xlo = ctx.reg_alloc.ReadX(args[0]);
auto Xhi = ctx.reg_alloc.ReadX(args[1]);
View File
@@ -84,7 +84,7 @@ IR::AccType Argument::GetImmediateAccType() const {
return value.GetAccType();
}
HostLoc::Kind Argument::CurrentLocationKind(RegAlloc& reg_alloc) const {
HostLoc::Kind Argument::CurrentLocationKind() const {
return reg_alloc.ValueLocation(value.GetInst())->kind;
}
@@ -131,7 +131,7 @@ void HostLocInfo::UpdateUses() {
}
RegAlloc::ArgumentInfo RegAlloc::GetArgumentInfo(IR::Inst* inst) {
ArgumentInfo ret = {Argument{}, Argument{}, Argument{}, Argument{}};
ArgumentInfo ret = {Argument{*this}, Argument{*this}, Argument{*this}, Argument{*this}};
for (size_t i = 0; i < inst->NumArgs(); i++) {
const IR::Value arg = inst->GetArg(i);
ret[i].value = arg;
View File
@@ -64,18 +64,18 @@ public:
IR::AccType GetImmediateAccType() const;
// Only valid if not immediate
HostLoc::Kind CurrentLocationKind(RegAlloc& reg_alloc) const;
bool IsInGpr(RegAlloc& reg_alloc) const {
return !IsImmediate() && CurrentLocationKind(reg_alloc) == HostLoc::Kind::Gpr;
}
bool IsInFpr(RegAlloc& reg_alloc) const {
return !IsImmediate() && CurrentLocationKind(reg_alloc) == HostLoc::Kind::Fpr;
}
HostLoc::Kind CurrentLocationKind() const;
bool IsInGpr() const { return !IsImmediate() && CurrentLocationKind() == HostLoc::Kind::Gpr; }
bool IsInFpr() const { return !IsImmediate() && CurrentLocationKind() == HostLoc::Kind::Fpr; }
private:
friend class RegAlloc;
IR::Value value;
explicit Argument(RegAlloc& reg_alloc)
: reg_alloc{reg_alloc} {}
bool allocated = false;
RegAlloc& reg_alloc;
IR::Value value;
};
struct FlagsTag final {
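The reg_alloc change threads a `RegAlloc&` through each `Argument` at construction, so location queries such as `IsInGpr()` no longer need the allocator passed at every call site. A stripped-down sketch of that back-reference pattern (`Kind` and the member names are simplified stand-ins for the dynarmic types):

```cpp
#include <cassert>

struct RegAlloc;  // forward declaration for the back-reference

enum class Kind { Gpr, Fpr, Immediate };

struct Argument {
    // The allocator is captured once, at GetArgumentInfo() time...
    explicit Argument(RegAlloc& ra) : reg_alloc{ra} {}
    Kind CurrentLocationKind() const;
    // ...so queries need no RegAlloc& parameter anymore.
    bool IsInGpr() const { return CurrentLocationKind() == Kind::Gpr; }
    RegAlloc& reg_alloc;
};

struct RegAlloc {
    Kind location = Kind::Gpr;  // where the allocator says the value lives
    Argument MakeArgument() { return Argument{*this}; }
};

Kind Argument::CurrentLocationKind() const { return reg_alloc.location; }
```

The same refactor explains the companion diffs: `ScratchGpr()`, `UseXmm()` and friends drop their `code` parameter because `RegAlloc` is now constructed with the code generator pointer up front.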
View File
@@ -117,7 +117,7 @@ A32EmitX64::BlockDescriptor A32EmitX64::Emit(IR::Block& block) {
return gprs;
}();
new (&this->reg_alloc) RegAlloc(gpr_order, any_xmm);
new (&this->reg_alloc) RegAlloc(&code, gpr_order, any_xmm);
A32EmitContext ctx{conf, reg_alloc, block};
// Start emitting.
@@ -283,47 +283,47 @@ void A32EmitX64::GenTerminalHandlers() {
void A32EmitX64::EmitA32SetCheckBit(A32EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const Xbyak::Reg8 to_store = ctx.reg_alloc.UseGpr(code, args[0]).cvt8();
const Xbyak::Reg8 to_store = ctx.reg_alloc.UseGpr(args[0]).cvt8();
code.mov(code.byte[rsp + ABI_SHADOW_SPACE + offsetof(StackLayout, check_bit)], to_store);
}
void A32EmitX64::EmitA32GetRegister(A32EmitContext& ctx, IR::Inst* inst) {
const A32::Reg reg = inst->GetArg(0).GetA32RegRef();
const Xbyak::Reg32 result = ctx.reg_alloc.ScratchGpr(code).cvt32();
const Xbyak::Reg32 result = ctx.reg_alloc.ScratchGpr().cvt32();
code.mov(result, MJitStateReg(reg));
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
}
void A32EmitX64::EmitA32GetExtendedRegister32(A32EmitContext& ctx, IR::Inst* inst) {
const A32::ExtReg reg = inst->GetArg(0).GetA32ExtRegRef();
ASSERT(A32::IsSingleExtReg(reg));
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm();
code.movss(result, MJitStateExtReg(reg));
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
}
void A32EmitX64::EmitA32GetExtendedRegister64(A32EmitContext& ctx, IR::Inst* inst) {
const A32::ExtReg reg = inst->GetArg(0).GetA32ExtRegRef();
ASSERT(A32::IsDoubleExtReg(reg));
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm();
code.movsd(result, MJitStateExtReg(reg));
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
}
void A32EmitX64::EmitA32GetVector(A32EmitContext& ctx, IR::Inst* inst) {
const A32::ExtReg reg = inst->GetArg(0).GetA32ExtRegRef();
ASSERT(A32::IsDoubleExtReg(reg) || A32::IsQuadExtReg(reg));
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm();
if (A32::IsDoubleExtReg(reg)) {
code.movsd(result, MJitStateExtReg(reg));
} else {
code.movaps(result, MJitStateExtReg(reg));
}
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
}
void A32EmitX64::EmitA32SetRegister(A32EmitContext& ctx, IR::Inst* inst) {
@@ -332,11 +332,11 @@ void A32EmitX64::EmitA32SetRegister(A32EmitContext& ctx, IR::Inst* inst) {
if (args[1].IsImmediate()) {
code.mov(MJitStateReg(reg), args[1].GetImmediateU32());
} else if (args[1].IsInXmm(ctx.reg_alloc)) {
const Xbyak::Xmm to_store = ctx.reg_alloc.UseXmm(code, args[1]);
} else if (args[1].IsInXmm()) {
const Xbyak::Xmm to_store = ctx.reg_alloc.UseXmm(args[1]);
code.movd(MJitStateReg(reg), to_store);
} else {
const Xbyak::Reg32 to_store = ctx.reg_alloc.UseGpr(code, args[1]).cvt32();
const Xbyak::Reg32 to_store = ctx.reg_alloc.UseGpr(args[1]).cvt32();
code.mov(MJitStateReg(reg), to_store);
}
}
@@ -346,11 +346,11 @@ void A32EmitX64::EmitA32SetExtendedRegister32(A32EmitContext& ctx, IR::Inst* ins
const A32::ExtReg reg = inst->GetArg(0).GetA32ExtRegRef();
ASSERT(A32::IsSingleExtReg(reg));
if (args[1].IsInXmm(ctx.reg_alloc)) {
Xbyak::Xmm to_store = ctx.reg_alloc.UseXmm(code, args[1]);
if (args[1].IsInXmm()) {
Xbyak::Xmm to_store = ctx.reg_alloc.UseXmm(args[1]);
code.movss(MJitStateExtReg(reg), to_store);
} else {
Xbyak::Reg32 to_store = ctx.reg_alloc.UseGpr(code, args[1]).cvt32();
Xbyak::Reg32 to_store = ctx.reg_alloc.UseGpr(args[1]).cvt32();
code.mov(MJitStateExtReg(reg), to_store);
}
}
@@ -360,11 +360,11 @@ void A32EmitX64::EmitA32SetExtendedRegister64(A32EmitContext& ctx, IR::Inst* ins
const A32::ExtReg reg = inst->GetArg(0).GetA32ExtRegRef();
ASSERT(A32::IsDoubleExtReg(reg));
if (args[1].IsInXmm(ctx.reg_alloc)) {
const Xbyak::Xmm to_store = ctx.reg_alloc.UseXmm(code, args[1]);
if (args[1].IsInXmm()) {
const Xbyak::Xmm to_store = ctx.reg_alloc.UseXmm(args[1]);
code.movsd(MJitStateExtReg(reg), to_store);
} else {
const Xbyak::Reg64 to_store = ctx.reg_alloc.UseGpr(code, args[1]);
const Xbyak::Reg64 to_store = ctx.reg_alloc.UseGpr(args[1]);
code.mov(MJitStateExtReg(reg), to_store);
}
}
@@ -374,7 +374,7 @@ void A32EmitX64::EmitA32SetVector(A32EmitContext& ctx, IR::Inst* inst) {
const A32::ExtReg reg = inst->GetArg(0).GetA32ExtRegRef();
ASSERT(A32::IsDoubleExtReg(reg) || A32::IsQuadExtReg(reg));
const Xbyak::Xmm to_store = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm to_store = ctx.reg_alloc.UseXmm(args[1]);
if (A32::IsDoubleExtReg(reg)) {
code.movsd(MJitStateExtReg(reg), to_store);
} else {
@@ -383,9 +383,9 @@ void A32EmitX64::EmitA32SetVector(A32EmitContext& ctx, IR::Inst* inst) {
}
void A32EmitX64::EmitA32GetCpsr(A32EmitContext& ctx, IR::Inst* inst) {
const Xbyak::Reg32 result = ctx.reg_alloc.ScratchGpr(code).cvt32();
const Xbyak::Reg32 tmp = ctx.reg_alloc.ScratchGpr(code).cvt32();
-const Xbyak::Reg32 tmp2 = ctx.reg_alloc.ScratchGpr(code).cvt32();
+const Xbyak::Reg32 result = ctx.reg_alloc.ScratchGpr().cvt32();
+const Xbyak::Reg32 tmp = ctx.reg_alloc.ScratchGpr().cvt32();
+const Xbyak::Reg32 tmp2 = ctx.reg_alloc.ScratchGpr().cvt32();
 if (code.HasHostFeature(HostFeature::FastBMI2)) {
 // Here we observe that cpsr_et and cpsr_ge are right next to each other in memory,
@@ -428,15 +428,15 @@ void A32EmitX64::EmitA32GetCpsr(A32EmitContext& ctx, IR::Inst* inst) {
 code.or_(result, dword[code.ABI_JIT_PTR + offsetof(A32JitState, cpsr_jaifm)]);
-ctx.reg_alloc.DefineValue(code, inst, result);
+ctx.reg_alloc.DefineValue(inst, result);
 }
 void A32EmitX64::EmitA32SetCpsr(A32EmitContext& ctx, IR::Inst* inst) {
 auto args = ctx.reg_alloc.GetArgumentInfo(inst);
-const Xbyak::Reg32 cpsr = ctx.reg_alloc.UseScratchGpr(code, args[0]).cvt32();
-const Xbyak::Reg32 tmp = ctx.reg_alloc.ScratchGpr(code).cvt32();
-const Xbyak::Reg32 tmp2 = ctx.reg_alloc.ScratchGpr(code).cvt32();
+const Xbyak::Reg32 cpsr = ctx.reg_alloc.UseScratchGpr(args[0]).cvt32();
+const Xbyak::Reg32 tmp = ctx.reg_alloc.ScratchGpr().cvt32();
+const Xbyak::Reg32 tmp2 = ctx.reg_alloc.ScratchGpr().cvt32();
 if (conf.always_little_endian) {
 code.and_(cpsr, 0xFFFFFDFF);
@@ -501,7 +501,7 @@ void A32EmitX64::EmitA32SetCpsr(A32EmitContext& ctx, IR::Inst* inst) {
 void A32EmitX64::EmitA32SetCpsrNZCV(A32EmitContext& ctx, IR::Inst* inst) {
 auto args = ctx.reg_alloc.GetArgumentInfo(inst);
-const Xbyak::Reg32 to_store = ctx.reg_alloc.UseScratchGpr(code, args[0]).cvt32();
+const Xbyak::Reg32 to_store = ctx.reg_alloc.UseScratchGpr(args[0]).cvt32();
 code.mov(dword[code.ABI_JIT_PTR + offsetof(A32JitState, cpsr_nzcv)], to_store);
 }
@@ -512,15 +512,15 @@ void A32EmitX64::EmitA32SetCpsrNZCVRaw(A32EmitContext& ctx, IR::Inst* inst) {
 code.mov(dword[code.ABI_JIT_PTR + offsetof(A32JitState, cpsr_nzcv)], NZCV::ToX64(imm));
 } else if (code.HasHostFeature(HostFeature::FastBMI2)) {
-const Xbyak::Reg32 a = ctx.reg_alloc.UseScratchGpr(code, args[0]).cvt32();
-const Xbyak::Reg32 b = ctx.reg_alloc.ScratchGpr(code).cvt32();
+const Xbyak::Reg32 a = ctx.reg_alloc.UseScratchGpr(args[0]).cvt32();
+const Xbyak::Reg32 b = ctx.reg_alloc.ScratchGpr().cvt32();
 code.shr(a, 28);
 code.mov(b, NZCV::x64_mask);
 code.pdep(a, a, b);
 code.mov(dword[code.ABI_JIT_PTR + offsetof(A32JitState, cpsr_nzcv)], a);
 } else {
-const Xbyak::Reg32 a = ctx.reg_alloc.UseScratchGpr(code, args[0]).cvt32();
+const Xbyak::Reg32 a = ctx.reg_alloc.UseScratchGpr(args[0]).cvt32();
 code.shr(a, 28);
 code.imul(a, a, NZCV::to_x64_multiplier);
@@ -537,8 +537,8 @@ void A32EmitX64::EmitA32SetCpsrNZCVQ(A32EmitContext& ctx, IR::Inst* inst) {
 code.mov(dword[code.ABI_JIT_PTR + offsetof(A32JitState, cpsr_nzcv)], NZCV::ToX64(imm));
 code.mov(code.byte[code.ABI_JIT_PTR + offsetof(A32JitState, cpsr_q)], u8((imm & 0x08000000) != 0 ? 1 : 0));
 } else if (code.HasHostFeature(HostFeature::FastBMI2)) {
-const Xbyak::Reg32 a = ctx.reg_alloc.UseScratchGpr(code, args[0]).cvt32();
-const Xbyak::Reg32 b = ctx.reg_alloc.ScratchGpr(code).cvt32();
+const Xbyak::Reg32 a = ctx.reg_alloc.UseScratchGpr(args[0]).cvt32();
+const Xbyak::Reg32 b = ctx.reg_alloc.ScratchGpr().cvt32();
 code.shr(a, 28);
 code.setc(code.byte[code.ABI_JIT_PTR + offsetof(A32JitState, cpsr_q)]);
@@ -546,7 +546,7 @@ void A32EmitX64::EmitA32SetCpsrNZCVQ(A32EmitContext& ctx, IR::Inst* inst) {
 code.pdep(a, a, b);
 code.mov(dword[code.ABI_JIT_PTR + offsetof(A32JitState, cpsr_nzcv)], a);
 } else {
-const Xbyak::Reg32 a = ctx.reg_alloc.UseScratchGpr(code, args[0]).cvt32();
+const Xbyak::Reg32 a = ctx.reg_alloc.UseScratchGpr(args[0]).cvt32();
 code.shr(a, 28);
 code.setc(code.byte[code.ABI_JIT_PTR + offsetof(A32JitState, cpsr_q)]);
@@ -559,8 +559,8 @@ void A32EmitX64::EmitA32SetCpsrNZCVQ(A32EmitContext& ctx, IR::Inst* inst) {
 void A32EmitX64::EmitA32SetCpsrNZ(A32EmitContext& ctx, IR::Inst* inst) {
 auto args = ctx.reg_alloc.GetArgumentInfo(inst);
-const Xbyak::Reg32 nz = ctx.reg_alloc.UseGpr(code, args[0]).cvt32();
-const Xbyak::Reg32 tmp = ctx.reg_alloc.ScratchGpr(code).cvt32();
+const Xbyak::Reg32 nz = ctx.reg_alloc.UseGpr(args[0]).cvt32();
+const Xbyak::Reg32 tmp = ctx.reg_alloc.ScratchGpr().cvt32();
 code.movzx(tmp, code.byte[code.ABI_JIT_PTR + offsetof(A32JitState, cpsr_nzcv) + 1]);
 code.and_(tmp, 1);
@@ -577,12 +577,12 @@ void A32EmitX64::EmitA32SetCpsrNZC(A32EmitContext& ctx, IR::Inst* inst) {
 code.mov(code.byte[code.ABI_JIT_PTR + offsetof(A32JitState, cpsr_nzcv) + 1], c);
 } else {
-const Xbyak::Reg8 c = ctx.reg_alloc.UseGpr(code, args[1]).cvt8();
+const Xbyak::Reg8 c = ctx.reg_alloc.UseGpr(args[1]).cvt8();
 code.mov(code.byte[code.ABI_JIT_PTR + offsetof(A32JitState, cpsr_nzcv) + 1], c);
 }
 } else {
-const Xbyak::Reg32 nz = ctx.reg_alloc.UseScratchGpr(code, args[0]).cvt32();
+const Xbyak::Reg32 nz = ctx.reg_alloc.UseScratchGpr(args[0]).cvt32();
 if (args[1].IsImmediate()) {
 const bool c = args[1].GetImmediateU1();
@@ -590,7 +590,7 @@ void A32EmitX64::EmitA32SetCpsrNZC(A32EmitContext& ctx, IR::Inst* inst) {
 code.or_(nz, c);
 code.mov(code.byte[code.ABI_JIT_PTR + offsetof(A32JitState, cpsr_nzcv) + 1], nz.cvt8());
 } else {
-const Xbyak::Reg32 c = ctx.reg_alloc.UseGpr(code, args[1]).cvt32();
+const Xbyak::Reg32 c = ctx.reg_alloc.UseGpr(args[1]).cvt32();
 code.or_(nz, c);
 code.mov(code.byte[code.ABI_JIT_PTR + offsetof(A32JitState, cpsr_nzcv) + 1], nz.cvt8());
@@ -599,13 +599,13 @@ void A32EmitX64::EmitA32SetCpsrNZC(A32EmitContext& ctx, IR::Inst* inst) {
 }
 static void EmitGetFlag(BlockOfCode& code, A32EmitContext& ctx, IR::Inst* inst, size_t flag_bit) {
-const Xbyak::Reg32 result = ctx.reg_alloc.ScratchGpr(code).cvt32();
+const Xbyak::Reg32 result = ctx.reg_alloc.ScratchGpr().cvt32();
 code.mov(result, dword[code.ABI_JIT_PTR + offsetof(A32JitState, cpsr_nzcv)]);
 if (flag_bit != 0) {
 code.shr(result, static_cast<int>(flag_bit));
 }
 code.and_(result, 1);
-ctx.reg_alloc.DefineValue(code, inst, result);
+ctx.reg_alloc.DefineValue(inst, result);
 }
 void A32EmitX64::EmitA32GetCFlag(A32EmitContext& ctx, IR::Inst* inst) {
@@ -619,27 +619,27 @@ void A32EmitX64::EmitA32OrQFlag(A32EmitContext& ctx, IR::Inst* inst) {
 code.mov(dword[code.ABI_JIT_PTR + offsetof(A32JitState, cpsr_q)], 1);
 }
 } else {
-const Xbyak::Reg8 to_store = ctx.reg_alloc.UseGpr(code, args[0]).cvt8();
+const Xbyak::Reg8 to_store = ctx.reg_alloc.UseGpr(args[0]).cvt8();
 code.or_(code.byte[code.ABI_JIT_PTR + offsetof(A32JitState, cpsr_q)], to_store);
 }
 }
 void A32EmitX64::EmitA32GetGEFlags(A32EmitContext& ctx, IR::Inst* inst) {
-const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm(code);
+const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm();
 code.movd(result, dword[code.ABI_JIT_PTR + offsetof(A32JitState, cpsr_ge)]);
-ctx.reg_alloc.DefineValue(code, inst, result);
+ctx.reg_alloc.DefineValue(inst, result);
 }
 void A32EmitX64::EmitA32SetGEFlags(A32EmitContext& ctx, IR::Inst* inst) {
 auto args = ctx.reg_alloc.GetArgumentInfo(inst);
 ASSERT(!args[0].IsImmediate());
-if (args[0].IsInXmm(ctx.reg_alloc)) {
-const Xbyak::Xmm to_store = ctx.reg_alloc.UseXmm(code, args[0]);
+if (args[0].IsInXmm()) {
+const Xbyak::Xmm to_store = ctx.reg_alloc.UseXmm(args[0]);
 code.movd(dword[code.ABI_JIT_PTR + offsetof(A32JitState, cpsr_ge)], to_store);
 } else {
-const Xbyak::Reg32 to_store = ctx.reg_alloc.UseGpr(code, args[0]).cvt32();
+const Xbyak::Reg32 to_store = ctx.reg_alloc.UseGpr(args[0]).cvt32();
 code.mov(dword[code.ABI_JIT_PTR + offsetof(A32JitState, cpsr_ge)], to_store);
 }
 }
@@ -656,8 +656,8 @@ void A32EmitX64::EmitA32SetGEFlagsCompressed(A32EmitContext& ctx, IR::Inst* inst
 code.mov(dword[code.ABI_JIT_PTR + offsetof(A32JitState, cpsr_ge)], ge);
 } else if (code.HasHostFeature(HostFeature::FastBMI2)) {
-const Xbyak::Reg32 a = ctx.reg_alloc.UseScratchGpr(code, args[0]).cvt32();
-const Xbyak::Reg32 b = ctx.reg_alloc.ScratchGpr(code).cvt32();
+const Xbyak::Reg32 a = ctx.reg_alloc.UseScratchGpr(args[0]).cvt32();
+const Xbyak::Reg32 b = ctx.reg_alloc.ScratchGpr().cvt32();
 code.mov(b, 0x01010101);
 code.shr(a, 16);
@@ -665,7 +665,7 @@ void A32EmitX64::EmitA32SetGEFlagsCompressed(A32EmitContext& ctx, IR::Inst* inst
 code.imul(a, a, 0xFF);
 code.mov(dword[code.ABI_JIT_PTR + offsetof(A32JitState, cpsr_ge)], a);
 } else {
-const Xbyak::Reg32 a = ctx.reg_alloc.UseScratchGpr(code, args[0]).cvt32();
+const Xbyak::Reg32 a = ctx.reg_alloc.UseScratchGpr(args[0]).cvt32();
 code.shr(a, 16);
 code.and_(a, 0xF);
@@ -690,7 +690,7 @@ void A32EmitX64::EmitA32InstructionSynchronizationBarrier(A32EmitContext& ctx, I
 return;
 }
-ctx.reg_alloc.HostCall(code, nullptr);
+ctx.reg_alloc.HostCall(nullptr);
 Devirtualize<&A32::UserCallbacks::InstructionSynchronizationBarrierRaised>(conf.callbacks).EmitCall(code);
 }
@@ -718,9 +718,9 @@ void A32EmitX64::EmitA32BXWritePC(A32EmitContext& ctx, IR::Inst* inst) {
 code.mov(MJitStateReg(A32::Reg::PC), new_pc & mask);
 code.mov(dword[code.ABI_JIT_PTR + offsetof(A32JitState, upper_location_descriptor)], new_upper);
 } else {
-const Xbyak::Reg32 new_pc = ctx.reg_alloc.UseScratchGpr(code, arg).cvt32();
-const Xbyak::Reg32 mask = ctx.reg_alloc.ScratchGpr(code).cvt32();
-const Xbyak::Reg32 new_upper = ctx.reg_alloc.ScratchGpr(code).cvt32();
+const Xbyak::Reg32 new_pc = ctx.reg_alloc.UseScratchGpr(arg).cvt32();
+const Xbyak::Reg32 mask = ctx.reg_alloc.ScratchGpr().cvt32();
+const Xbyak::Reg32 new_upper = ctx.reg_alloc.ScratchGpr().cvt32();
 code.mov(mask, new_pc);
 code.and_(mask, 1);
@@ -745,7 +745,7 @@ void A32EmitX64::EmitA32CallSupervisor(A32EmitContext& ctx, IR::Inst* inst) {
 code.SwitchMxcsrOnExit();
 if (conf.enable_cycle_counting) {
-ctx.reg_alloc.HostCall(code, nullptr);
+ctx.reg_alloc.HostCall(nullptr);
 code.mov(code.ABI_PARAM2, qword[rsp + ABI_SHADOW_SPACE + offsetof(StackLayout, cycles_to_run)]);
 code.sub(code.ABI_PARAM2, qword[rsp + ABI_SHADOW_SPACE + offsetof(StackLayout, cycles_remaining)]);
 Devirtualize<&A32::UserCallbacks::AddTicks>(conf.callbacks).EmitCall(code);
@@ -753,7 +753,7 @@ void A32EmitX64::EmitA32CallSupervisor(A32EmitContext& ctx, IR::Inst* inst) {
 }
 auto args = ctx.reg_alloc.GetArgumentInfo(inst);
-ctx.reg_alloc.HostCall(code, nullptr, {}, args[0]);
+ctx.reg_alloc.HostCall(nullptr, {}, args[0]);
 Devirtualize<&A32::UserCallbacks::CallSVC>(conf.callbacks).EmitCall(code);
 if (conf.enable_cycle_counting) {
@@ -767,7 +767,7 @@ void A32EmitX64::EmitA32CallSupervisor(A32EmitContext& ctx, IR::Inst* inst) {
 void A32EmitX64::EmitA32ExceptionRaised(A32EmitContext& ctx, IR::Inst* inst) {
 code.SwitchMxcsrOnExit();
-ctx.reg_alloc.HostCall(code, nullptr);
+ctx.reg_alloc.HostCall(nullptr);
 if (conf.enable_cycle_counting) {
 code.mov(code.ABI_PARAM2, qword[rsp + ABI_SHADOW_SPACE + offsetof(StackLayout, cycles_to_run)]);
 code.sub(code.ABI_PARAM2, qword[rsp + ABI_SHADOW_SPACE + offsetof(StackLayout, cycles_remaining)]);
@@ -797,7 +797,7 @@ static u32 GetFpscrImpl(A32JitState* jit_state) {
 }
 void A32EmitX64::EmitA32GetFpscr(A32EmitContext& ctx, IR::Inst* inst) {
-ctx.reg_alloc.HostCall(code, inst);
+ctx.reg_alloc.HostCall(inst);
 code.mov(code.ABI_PARAM1, code.ABI_JIT_PTR);
 code.stmxcsr(code.dword[code.ABI_JIT_PTR + offsetof(A32JitState, guest_MXCSR)]);
@@ -810,7 +810,7 @@ static void SetFpscrImpl(u32 value, A32JitState* jit_state) {
 void A32EmitX64::EmitA32SetFpscr(A32EmitContext& ctx, IR::Inst* inst) {
 auto args = ctx.reg_alloc.GetArgumentInfo(inst);
-ctx.reg_alloc.HostCall(code, nullptr, args[0]);
+ctx.reg_alloc.HostCall(nullptr, args[0]);
 code.mov(code.ABI_PARAM2, code.ABI_JIT_PTR);
 code.CallFunction(&SetFpscrImpl);
@@ -818,17 +818,17 @@ void A32EmitX64::EmitA32SetFpscr(A32EmitContext& ctx, IR::Inst* inst) {
 }
 void A32EmitX64::EmitA32GetFpscrNZCV(A32EmitContext& ctx, IR::Inst* inst) {
-const Xbyak::Reg32 result = ctx.reg_alloc.ScratchGpr(code).cvt32();
+const Xbyak::Reg32 result = ctx.reg_alloc.ScratchGpr().cvt32();
 code.mov(result, dword[code.ABI_JIT_PTR + offsetof(A32JitState, fpsr_nzcv)]);
-ctx.reg_alloc.DefineValue(code, inst, result);
+ctx.reg_alloc.DefineValue(inst, result);
 }
 void A32EmitX64::EmitA32SetFpscrNZCV(A32EmitContext& ctx, IR::Inst* inst) {
 auto args = ctx.reg_alloc.GetArgumentInfo(inst);
 if (code.HasHostFeature(HostFeature::FastBMI2)) {
-const Xbyak::Reg32 value = ctx.reg_alloc.UseGpr(code, args[0]).cvt32();
-const Xbyak::Reg32 tmp = ctx.reg_alloc.ScratchGpr(code).cvt32();
+const Xbyak::Reg32 value = ctx.reg_alloc.UseGpr(args[0]).cvt32();
+const Xbyak::Reg32 tmp = ctx.reg_alloc.ScratchGpr().cvt32();
 code.mov(tmp, NZCV::x64_mask);
 code.pext(tmp, value, tmp);
@@ -838,7 +838,7 @@ void A32EmitX64::EmitA32SetFpscrNZCV(A32EmitContext& ctx, IR::Inst* inst) {
 return;
 }
-const Xbyak::Reg32 value = ctx.reg_alloc.UseScratchGpr(code, args[0]).cvt32();
+const Xbyak::Reg32 value = ctx.reg_alloc.UseScratchGpr(args[0]).cvt32();
 code.and_(value, NZCV::x64_mask);
 code.imul(value, value, NZCV::from_x64_multiplier);
@@ -851,7 +851,7 @@ static void EmitCoprocessorException() {
 }
 static void CallCoprocCallback(BlockOfCode& code, RegAlloc& reg_alloc, A32::Coprocessor::Callback callback, IR::Inst* inst = nullptr, std::optional<Argument::copyable_reference> arg0 = {}, std::optional<Argument::copyable_reference> arg1 = {}) {
-reg_alloc.HostCall(code, inst, {}, arg0, arg1);
+reg_alloc.HostCall(inst, {}, arg0, arg1);
 if (callback.user_arg) {
 code.mov(code.ABI_PARAM1, reinterpret_cast<u64>(*callback.user_arg));
@@ -914,8 +914,8 @@ void A32EmitX64::EmitA32CoprocSendOneWord(A32EmitContext& ctx, IR::Inst* inst) {
 }
 if (const auto destination_ptr = std::get_if<u32*>(&action)) {
-const Xbyak::Reg32 reg_word = ctx.reg_alloc.UseGpr(code, args[1]).cvt32();
-const Xbyak::Reg64 reg_destination_addr = ctx.reg_alloc.ScratchGpr(code);
+const Xbyak::Reg32 reg_word = ctx.reg_alloc.UseGpr(args[1]).cvt32();
+const Xbyak::Reg64 reg_destination_addr = ctx.reg_alloc.ScratchGpr();
 code.mov(reg_destination_addr, reinterpret_cast<u64>(*destination_ptr));
 code.mov(code.dword[reg_destination_addr], reg_word);
@@ -954,9 +954,9 @@ void A32EmitX64::EmitA32CoprocSendTwoWords(A32EmitContext& ctx, IR::Inst* inst)
 }
 if (const auto destination_ptrs = std::get_if<std::array<u32*, 2>>(&action)) {
-const Xbyak::Reg32 reg_word1 = ctx.reg_alloc.UseGpr(code, args[1]).cvt32();
-const Xbyak::Reg32 reg_word2 = ctx.reg_alloc.UseGpr(code, args[2]).cvt32();
-const Xbyak::Reg64 reg_destination_addr = ctx.reg_alloc.ScratchGpr(code);
+const Xbyak::Reg32 reg_word1 = ctx.reg_alloc.UseGpr(args[1]).cvt32();
+const Xbyak::Reg32 reg_word2 = ctx.reg_alloc.UseGpr(args[2]).cvt32();
+const Xbyak::Reg64 reg_destination_addr = ctx.reg_alloc.ScratchGpr();
 code.mov(reg_destination_addr, reinterpret_cast<u64>((*destination_ptrs)[0]));
 code.mov(code.dword[reg_destination_addr], reg_word1);
@@ -998,13 +998,13 @@ void A32EmitX64::EmitA32CoprocGetOneWord(A32EmitContext& ctx, IR::Inst* inst) {
 }
 if (const auto source_ptr = std::get_if<u32*>(&action)) {
-const Xbyak::Reg32 reg_word = ctx.reg_alloc.ScratchGpr(code).cvt32();
-const Xbyak::Reg64 reg_source_addr = ctx.reg_alloc.ScratchGpr(code);
+const Xbyak::Reg32 reg_word = ctx.reg_alloc.ScratchGpr().cvt32();
+const Xbyak::Reg64 reg_source_addr = ctx.reg_alloc.ScratchGpr();
 code.mov(reg_source_addr, reinterpret_cast<u64>(*source_ptr));
 code.mov(reg_word, code.dword[reg_source_addr]);
-ctx.reg_alloc.DefineValue(code, inst, reg_word);
+ctx.reg_alloc.DefineValue(inst, reg_word);
 return;
 }
@@ -1038,9 +1038,9 @@ void A32EmitX64::EmitA32CoprocGetTwoWords(A32EmitContext& ctx, IR::Inst* inst) {
 }
 if (const auto source_ptrs = std::get_if<std::array<u32*, 2>>(&action)) {
-const Xbyak::Reg64 reg_result = ctx.reg_alloc.ScratchGpr(code);
-const Xbyak::Reg64 reg_destination_addr = ctx.reg_alloc.ScratchGpr(code);
-const Xbyak::Reg64 reg_tmp = ctx.reg_alloc.ScratchGpr(code);
+const Xbyak::Reg64 reg_result = ctx.reg_alloc.ScratchGpr();
+const Xbyak::Reg64 reg_destination_addr = ctx.reg_alloc.ScratchGpr();
+const Xbyak::Reg64 reg_tmp = ctx.reg_alloc.ScratchGpr();
 code.mov(reg_destination_addr, reinterpret_cast<u64>((*source_ptrs)[1]));
 code.mov(reg_result.cvt32(), code.dword[reg_destination_addr]);
@@ -1049,7 +1049,7 @@ void A32EmitX64::EmitA32CoprocGetTwoWords(A32EmitContext& ctx, IR::Inst* inst) {
 code.mov(reg_tmp.cvt32(), code.dword[reg_destination_addr]);
 code.or_(reg_result, reg_tmp);
-ctx.reg_alloc.DefineValue(code, inst, reg_result);
+ctx.reg_alloc.DefineValue(inst, reg_result);
 return;
 }


@@ -91,7 +91,7 @@ A64EmitX64::BlockDescriptor A64EmitX64::Emit(IR::Block& block) noexcept {
 return gprs;
 }();
-new (&this->reg_alloc) RegAlloc{gpr_order, any_xmm};
+new (&this->reg_alloc) RegAlloc{&code, gpr_order, any_xmm};
 A64EmitContext ctx{conf, reg_alloc, block};
 // Start emitting.
@@ -159,7 +159,7 @@ finish_this_inst:
 }
 code.int3();
-const size_t size = size_t(code.getCurr() - entrypoint);
+const size_t size = static_cast<size_t>(code.getCurr() - entrypoint);
 const A64::LocationDescriptor descriptor{block.Location()};
 const A64::LocationDescriptor end_location{block.EndLocation()};
@@ -266,25 +266,25 @@ void A64EmitX64::EmitPushRSB(EmitContext& ctx, IR::Inst* inst) {
 void A64EmitX64::EmitA64SetCheckBit(A64EmitContext& ctx, IR::Inst* inst) {
 auto args = ctx.reg_alloc.GetArgumentInfo(inst);
-const Xbyak::Reg8 to_store = ctx.reg_alloc.UseGpr(code, args[0]).cvt8();
+const Xbyak::Reg8 to_store = ctx.reg_alloc.UseGpr(args[0]).cvt8();
 code.mov(code.byte[rsp + ABI_SHADOW_SPACE + offsetof(StackLayout, check_bit)], to_store);
 }
 void A64EmitX64::EmitA64GetCFlag(A64EmitContext& ctx, IR::Inst* inst) {
-const Xbyak::Reg32 result = ctx.reg_alloc.ScratchGpr(code).cvt32();
+const Xbyak::Reg32 result = ctx.reg_alloc.ScratchGpr().cvt32();
 code.mov(result, dword[code.ABI_JIT_PTR + offsetof(A64JitState, cpsr_nzcv)]);
 code.shr(result, NZCV::x64_c_flag_bit);
 code.and_(result, 1);
-ctx.reg_alloc.DefineValue(code, inst, result);
+ctx.reg_alloc.DefineValue(inst, result);
 }
 void A64EmitX64::EmitA64GetNZCVRaw(A64EmitContext& ctx, IR::Inst* inst) {
-const Xbyak::Reg32 nzcv_raw = ctx.reg_alloc.ScratchGpr(code).cvt32();
+const Xbyak::Reg32 nzcv_raw = ctx.reg_alloc.ScratchGpr().cvt32();
 code.mov(nzcv_raw, dword[code.ABI_JIT_PTR + offsetof(A64JitState, cpsr_nzcv)]);
 if (code.HasHostFeature(HostFeature::FastBMI2)) {
-const Xbyak::Reg32 tmp = ctx.reg_alloc.ScratchGpr(code).cvt32();
+const Xbyak::Reg32 tmp = ctx.reg_alloc.ScratchGpr().cvt32();
 code.mov(tmp, NZCV::x64_mask);
 code.pext(nzcv_raw, nzcv_raw, tmp);
 code.shl(nzcv_raw, 28);
@@ -294,16 +294,16 @@ void A64EmitX64::EmitA64GetNZCVRaw(A64EmitContext& ctx, IR::Inst* inst) {
 code.and_(nzcv_raw, NZCV::arm_mask);
 }
-ctx.reg_alloc.DefineValue(code, inst, nzcv_raw);
+ctx.reg_alloc.DefineValue(inst, nzcv_raw);
 }
 void A64EmitX64::EmitA64SetNZCVRaw(A64EmitContext& ctx, IR::Inst* inst) {
 auto args = ctx.reg_alloc.GetArgumentInfo(inst);
-const Xbyak::Reg32 nzcv_raw = ctx.reg_alloc.UseScratchGpr(code, args[0]).cvt32();
+const Xbyak::Reg32 nzcv_raw = ctx.reg_alloc.UseScratchGpr(args[0]).cvt32();
 code.shr(nzcv_raw, 28);
 if (code.HasHostFeature(HostFeature::FastBMI2)) {
-const Xbyak::Reg32 tmp = ctx.reg_alloc.ScratchGpr(code).cvt32();
+const Xbyak::Reg32 tmp = ctx.reg_alloc.ScratchGpr().cvt32();
 code.mov(tmp, NZCV::x64_mask);
 code.pdep(nzcv_raw, nzcv_raw, tmp);
 } else {
@@ -315,63 +315,63 @@ void A64EmitX64::EmitA64SetNZCVRaw(A64EmitContext& ctx, IR::Inst* inst) {
 void A64EmitX64::EmitA64SetNZCV(A64EmitContext& ctx, IR::Inst* inst) {
 auto args = ctx.reg_alloc.GetArgumentInfo(inst);
-const Xbyak::Reg32 to_store = ctx.reg_alloc.UseScratchGpr(code, args[0]).cvt32();
+const Xbyak::Reg32 to_store = ctx.reg_alloc.UseScratchGpr(args[0]).cvt32();
 code.mov(dword[code.ABI_JIT_PTR + offsetof(A64JitState, cpsr_nzcv)], to_store);
 }
 void A64EmitX64::EmitA64GetW(A64EmitContext& ctx, IR::Inst* inst) {
 const A64::Reg reg = inst->GetArg(0).GetA64RegRef();
-const Xbyak::Reg32 result = ctx.reg_alloc.ScratchGpr(code).cvt32();
+const Xbyak::Reg32 result = ctx.reg_alloc.ScratchGpr().cvt32();
 code.mov(result, dword[code.ABI_JIT_PTR + offsetof(A64JitState, reg) + sizeof(u64) * static_cast<size_t>(reg)]);
-ctx.reg_alloc.DefineValue(code, inst, result);
+ctx.reg_alloc.DefineValue(inst, result);
 }
 void A64EmitX64::EmitA64GetX(A64EmitContext& ctx, IR::Inst* inst) {
 const A64::Reg reg = inst->GetArg(0).GetA64RegRef();
-const Xbyak::Reg64 result = ctx.reg_alloc.ScratchGpr(code);
+const Xbyak::Reg64 result = ctx.reg_alloc.ScratchGpr();
 code.mov(result, qword[code.ABI_JIT_PTR + offsetof(A64JitState, reg) + sizeof(u64) * static_cast<size_t>(reg)]);
-ctx.reg_alloc.DefineValue(code, inst, result);
+ctx.reg_alloc.DefineValue(inst, result);
 }
 void A64EmitX64::EmitA64GetS(A64EmitContext& ctx, IR::Inst* inst) {
 const A64::Vec vec = inst->GetArg(0).GetA64VecRef();
 const auto addr = qword[code.ABI_JIT_PTR + offsetof(A64JitState, vec) + sizeof(u64) * 2 * static_cast<size_t>(vec)];
-const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm(code);
+const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm();
 code.movd(result, addr);
-ctx.reg_alloc.DefineValue(code, inst, result);
+ctx.reg_alloc.DefineValue(inst, result);
 }
 void A64EmitX64::EmitA64GetD(A64EmitContext& ctx, IR::Inst* inst) {
 const A64::Vec vec = inst->GetArg(0).GetA64VecRef();
 const auto addr = qword[code.ABI_JIT_PTR + offsetof(A64JitState, vec) + sizeof(u64) * 2 * static_cast<size_t>(vec)];
-const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm(code);
+const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm();
 code.movq(result, addr);
-ctx.reg_alloc.DefineValue(code, inst, result);
+ctx.reg_alloc.DefineValue(inst, result);
 }
 void A64EmitX64::EmitA64GetQ(A64EmitContext& ctx, IR::Inst* inst) {
 const A64::Vec vec = inst->GetArg(0).GetA64VecRef();
 const auto addr = xword[code.ABI_JIT_PTR + offsetof(A64JitState, vec) + sizeof(u64) * 2 * static_cast<size_t>(vec)];
-const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm(code);
+const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm();
 code.movaps(result, addr);
-ctx.reg_alloc.DefineValue(code, inst, result);
+ctx.reg_alloc.DefineValue(inst, result);
 }
 void A64EmitX64::EmitA64GetSP(A64EmitContext& ctx, IR::Inst* inst) {
-const Xbyak::Reg64 result = ctx.reg_alloc.ScratchGpr(code);
+const Xbyak::Reg64 result = ctx.reg_alloc.ScratchGpr();
 code.mov(result, qword[code.ABI_JIT_PTR + offsetof(A64JitState, sp)]);
-ctx.reg_alloc.DefineValue(code, inst, result);
+ctx.reg_alloc.DefineValue(inst, result);
 }
 void A64EmitX64::EmitA64GetFPCR(A64EmitContext& ctx, IR::Inst* inst) {
-const Xbyak::Reg32 result = ctx.reg_alloc.ScratchGpr(code).cvt32();
+const Xbyak::Reg32 result = ctx.reg_alloc.ScratchGpr().cvt32();
 code.mov(result, dword[code.ABI_JIT_PTR + offsetof(A64JitState, fpcr)]);
-ctx.reg_alloc.DefineValue(code, inst, result);
+ctx.reg_alloc.DefineValue(inst, result);
 }
 static u32 GetFPSRImpl(A64JitState* jit_state) {
@@ -379,7 +379,7 @@ static u32 GetFPSRImpl(A64JitState* jit_state) {
 }
 void A64EmitX64::EmitA64GetFPSR(A64EmitContext& ctx, IR::Inst* inst) {
-ctx.reg_alloc.HostCall(code, inst);
+ctx.reg_alloc.HostCall(inst);
 code.mov(code.ABI_PARAM1, code.ABI_JIT_PTR);
 code.stmxcsr(code.dword[code.ABI_JIT_PTR + offsetof(A64JitState, guest_MXCSR)]);
 code.CallFunction(GetFPSRImpl);
@@ -393,7 +393,7 @@ void A64EmitX64::EmitA64SetW(A64EmitContext& ctx, IR::Inst* inst) {
 code.mov(addr, args[1].GetImmediateS32());
 } else {
 // TODO: zext tracking, xmm variant
-const Xbyak::Reg64 to_store = ctx.reg_alloc.UseScratchGpr(code, args[1]);
+const Xbyak::Reg64 to_store = ctx.reg_alloc.UseScratchGpr(args[1]);
 code.mov(to_store.cvt32(), to_store.cvt32());
 code.mov(addr, to_store);
 }
@@ -405,11 +405,11 @@ void A64EmitX64::EmitA64SetX(A64EmitContext& ctx, IR::Inst* inst) {
 const auto addr = qword[code.ABI_JIT_PTR + offsetof(A64JitState, reg) + sizeof(u64) * static_cast<size_t>(reg)];
 if (args[1].FitsInImmediateS32()) {
 code.mov(addr, args[1].GetImmediateS32());
-} else if (args[1].IsInXmm(ctx.reg_alloc)) {
-const Xbyak::Xmm to_store = ctx.reg_alloc.UseXmm(code, args[1]);
+} else if (args[1].IsInXmm()) {
+const Xbyak::Xmm to_store = ctx.reg_alloc.UseXmm(args[1]);
 code.movq(addr, to_store);
 } else {
-const Xbyak::Reg64 to_store = ctx.reg_alloc.UseGpr(code, args[1]);
+const Xbyak::Reg64 to_store = ctx.reg_alloc.UseGpr(args[1]);
 code.mov(addr, to_store);
 }
 }
@@ -419,8 +419,8 @@ void A64EmitX64::EmitA64SetS(A64EmitContext& ctx, IR::Inst* inst) {
 const A64::Vec vec = inst->GetArg(0).GetA64VecRef();
 const auto addr = xword[code.ABI_JIT_PTR + offsetof(A64JitState, vec) + sizeof(u64) * 2 * static_cast<size_t>(vec)];
-const Xbyak::Xmm to_store = ctx.reg_alloc.UseXmm(code, args[1]);
-const Xbyak::Xmm tmp = ctx.reg_alloc.ScratchXmm(code);
+const Xbyak::Xmm to_store = ctx.reg_alloc.UseXmm(args[1]);
+const Xbyak::Xmm tmp = ctx.reg_alloc.ScratchXmm();
 // TODO: Optimize
 code.pxor(tmp, tmp);
 code.movss(tmp, to_store);
@@ -432,7 +432,7 @@ void A64EmitX64::EmitA64SetD(A64EmitContext& ctx, IR::Inst* inst) {
 const A64::Vec vec = inst->GetArg(0).GetA64VecRef();
 const auto addr = xword[code.ABI_JIT_PTR + offsetof(A64JitState, vec) + sizeof(u64) * 2 * static_cast<size_t>(vec)];
-const Xbyak::Xmm to_store = ctx.reg_alloc.UseScratchXmm(code, args[1]);
+const Xbyak::Xmm to_store = ctx.reg_alloc.UseScratchXmm(args[1]);
 code.movq(to_store, to_store); // TODO: Remove when able
 code.movaps(addr, to_store);
 }
@@ -442,7 +442,7 @@ void A64EmitX64::EmitA64SetQ(A64EmitContext& ctx, IR::Inst* inst) {
 const A64::Vec vec = inst->GetArg(0).GetA64VecRef();
 const auto addr = xword[code.ABI_JIT_PTR + offsetof(A64JitState, vec) + sizeof(u64) * 2 * static_cast<size_t>(vec)];
-const Xbyak::Xmm to_store = ctx.reg_alloc.UseXmm(code, args[1]);
+const Xbyak::Xmm to_store = ctx.reg_alloc.UseXmm(args[1]);
 code.movaps(addr, to_store);
 }
@@ -451,11 +451,11 @@ void A64EmitX64::EmitA64SetSP(A64EmitContext& ctx, IR::Inst* inst) {
 const auto addr = qword[code.ABI_JIT_PTR + offsetof(A64JitState, sp)];
 if (args[0].FitsInImmediateS32()) {
 code.mov(addr, args[0].GetImmediateS32());
-} else if (args[0].IsInXmm(ctx.reg_alloc)) {
-const Xbyak::Xmm to_store = ctx.reg_alloc.UseXmm(code, args[0]);
+} else if (args[0].IsInXmm()) {
+const Xbyak::Xmm to_store = ctx.reg_alloc.UseXmm(args[0]);
 code.movq(addr, to_store);
 } else {
-const Xbyak::Reg64 to_store = ctx.reg_alloc.UseGpr(code, args[0]);
+const Xbyak::Reg64 to_store = ctx.reg_alloc.UseGpr(args[0]);
 code.mov(addr, to_store);
 }
 }
@@ -466,7 +466,7 @@ static void SetFPCRImpl(A64JitState* jit_state, u32 value) {
 void A64EmitX64::EmitA64SetFPCR(A64EmitContext& ctx, IR::Inst* inst) {
 auto args = ctx.reg_alloc.GetArgumentInfo(inst);
-ctx.reg_alloc.HostCall(code, nullptr, {}, args[0]);
+ctx.reg_alloc.HostCall(nullptr, {}, args[0]);
 code.mov(code.ABI_PARAM1, code.ABI_JIT_PTR);
 code.CallFunction(SetFPCRImpl);
 code.ldmxcsr(code.dword[code.ABI_JIT_PTR + offsetof(A64JitState, guest_MXCSR)]);
@@ -478,7 +478,7 @@ static void SetFPSRImpl(A64JitState* jit_state, u32 value) {
 void A64EmitX64::EmitA64SetFPSR(A64EmitContext& ctx, IR::Inst* inst) {
 auto args = ctx.reg_alloc.GetArgumentInfo(inst);
-ctx.reg_alloc.HostCall(code, nullptr, {}, args[0]);
+ctx.reg_alloc.HostCall(nullptr, {}, args[0]);
 code.mov(code.ABI_PARAM1, code.ABI_JIT_PTR);
 code.CallFunction(SetFPSRImpl);
 code.ldmxcsr(code.dword[code.ABI_JIT_PTR + offsetof(A64JitState, guest_MXCSR)]);
@@ -489,17 +489,17 @@ void A64EmitX64::EmitA64SetPC(A64EmitContext& ctx, IR::Inst* inst) {
 const auto addr = qword[code.ABI_JIT_PTR + offsetof(A64JitState, pc)];
 if (args[0].FitsInImmediateS32()) {
 code.mov(addr, args[0].GetImmediateS32());
-} else if (args[0].IsInXmm(ctx.reg_alloc)) {
-const Xbyak::Xmm to_store = ctx.reg_alloc.UseXmm(code, args[0]);
+} else if (args[0].IsInXmm()) {
+const Xbyak::Xmm to_store = ctx.reg_alloc.UseXmm(args[0]);
 code.movq(addr, to_store);
 } else {
-const Xbyak::Reg64 to_store = ctx.reg_alloc.UseGpr(code, args[0]);
+const Xbyak::Reg64 to_store = ctx.reg_alloc.UseGpr(args[0]);
 code.mov(addr, to_store);
 }
 }
 void A64EmitX64::EmitA64CallSupervisor(A64EmitContext& ctx, IR::Inst* inst) {
-ctx.reg_alloc.HostCall(code, nullptr);
+ctx.reg_alloc.HostCall(nullptr);
 auto args = ctx.reg_alloc.GetArgumentInfo(inst);
 ASSERT(args[0].IsImmediate());
 const u32 imm = args[0].GetImmediateU32();
@@ -511,7 +511,7 @@ void A64EmitX64::EmitA64CallSupervisor(A64EmitContext& ctx, IR::Inst* inst) {
 }
 void A64EmitX64::EmitA64ExceptionRaised(A64EmitContext& ctx, IR::Inst* inst) {
-ctx.reg_alloc.HostCall(code, nullptr);
+ctx.reg_alloc.HostCall(nullptr);
 auto args = ctx.reg_alloc.GetArgumentInfo(inst);
 ASSERT(args[0].IsImmediate() && args[1].IsImmediate());
 const u64 pc = args[0].GetImmediateU64();
@@ -524,13 +524,13 @@ void A64EmitX64::EmitA64ExceptionRaised(A64EmitContext& ctx, IR::Inst* inst) {
 void A64EmitX64::EmitA64DataCacheOperationRaised(A64EmitContext& ctx, IR::Inst* inst) {
 auto args = ctx.reg_alloc.GetArgumentInfo(inst);
-ctx.reg_alloc.HostCall(code, nullptr, {}, args[1], args[2]);
+ctx.reg_alloc.HostCall(nullptr, {}, args[1], args[2]);
 Devirtualize<&A64::UserCallbacks::DataCacheOperationRaised>(conf.callbacks).EmitCall(code);
 }
 void A64EmitX64::EmitA64InstructionCacheOperationRaised(A64EmitContext& ctx, IR::Inst* inst) {
 auto args = ctx.reg_alloc.GetArgumentInfo(inst);
-ctx.reg_alloc.HostCall(code, nullptr, {}, args[0], args[1]);
+ctx.reg_alloc.HostCall(nullptr, {}, args[0], args[1]);
 Devirtualize<&A64::UserCallbacks::InstructionCacheOperationRaised>(conf.callbacks).EmitCall(code);
 }
@@ -548,18 +548,18 @@ void A64EmitX64::EmitA64InstructionSynchronizationBarrier(A64EmitContext& ctx, I
 return;
 }
-ctx.reg_alloc.HostCall(code, nullptr);
+ctx.reg_alloc.HostCall(nullptr);
 Devirtualize<&A64::UserCallbacks::InstructionSynchronizationBarrierRaised>(conf.callbacks).EmitCall(code);
 }
 void A64EmitX64::EmitA64GetCNTFRQ(A64EmitContext& ctx, IR::Inst* inst) {
-const Xbyak::Reg32 result = ctx.reg_alloc.ScratchGpr(code).cvt32();
+const Xbyak::Reg32 result = ctx.reg_alloc.ScratchGpr().cvt32();
 code.mov(result, conf.cntfrq_el0);
-ctx.reg_alloc.DefineValue(code, inst, result);
+ctx.reg_alloc.DefineValue(inst, result);
 }
 void A64EmitX64::EmitA64GetCNTPCT(A64EmitContext& ctx, IR::Inst* inst) {
-ctx.reg_alloc.HostCall(code, inst);
+ctx.reg_alloc.HostCall(inst);
 if (!conf.wall_clock_cntpct) {
 code.UpdateTicks();
 }
@@ -567,43 +567,43 @@ void A64EmitX64::EmitA64GetCNTPCT(A64EmitContext& ctx, IR::Inst* inst) {
 }
 void A64EmitX64::EmitA64GetCTR(A64EmitContext& ctx, IR::Inst* inst) {
-const Xbyak::Reg32 result = ctx.reg_alloc.ScratchGpr(code).cvt32();
+const Xbyak::Reg32 result = ctx.reg_alloc.ScratchGpr().cvt32();
 code.mov(result, conf.ctr_el0);
-ctx.reg_alloc.DefineValue(code, inst, result);
+ctx.reg_alloc.DefineValue(inst, result);
 }
 void A64EmitX64::EmitA64GetDCZID(A64EmitContext& ctx, IR::Inst* inst) {
-const Xbyak::Reg32 result = ctx.reg_alloc.ScratchGpr(code).cvt32();
+const Xbyak::Reg32 result = ctx.reg_alloc.ScratchGpr().cvt32();
 code.mov(result, conf.dczid_el0);
-ctx.reg_alloc.DefineValue(code, inst, result);
+ctx.reg_alloc.DefineValue(inst, result);
 }
 void A64EmitX64::EmitA64GetTPIDR(A64EmitContext& ctx, IR::Inst* inst) {
-const Xbyak::Reg64 result = ctx.reg_alloc.ScratchGpr(code);
+const Xbyak::Reg64 result = ctx.reg_alloc.ScratchGpr();
 if (conf.tpidr_el0) {
 code.mov(result, u64(conf.tpidr_el0));
 code.mov(result, qword[result]);
 } else {
 code.xor_(result.cvt32(), result.cvt32());
 }
-ctx.reg_alloc.DefineValue(code, inst, result);
+ctx.reg_alloc.DefineValue(inst, result);
 }
 void A64EmitX64::EmitA64GetTPIDRRO(A64EmitContext& ctx, IR::Inst* inst) {
-const Xbyak::Reg64 result = ctx.reg_alloc.ScratchGpr(code);
+const Xbyak::Reg64 result = ctx.reg_alloc.ScratchGpr();
 if (conf.tpidrro_el0) {
 code.mov(result, u64(conf.tpidrro_el0));
 code.mov(result, qword[result]);
 } else {
 code.xor_(result.cvt32(), result.cvt32());
 }
-ctx.reg_alloc.DefineValue(code, inst, result);
+ctx.reg_alloc.DefineValue(inst, result);
 }
 void A64EmitX64::EmitA64SetTPIDR(A64EmitContext& ctx, IR::Inst* inst) {
 auto args = ctx.reg_alloc.GetArgumentInfo(inst);
-const Xbyak::Reg64 value = ctx.reg_alloc.UseGpr(code, args[0]);
-const Xbyak::Reg64 addr = ctx.reg_alloc.ScratchGpr(code);
+const Xbyak::Reg64 value = ctx.reg_alloc.UseGpr(args[0]);
+const Xbyak::Reg64 addr = ctx.reg_alloc.ScratchGpr();
 if (conf.tpidr_el0) {
 code.mov(addr, u64(conf.tpidr_el0));
 code.mov(qword[addr], value);


@@ -68,7 +68,7 @@ void EmitX64::EmitVoid(EmitContext&, IR::Inst*) {
 void EmitX64::EmitIdentity(EmitContext& ctx, IR::Inst* inst) {
 auto args = ctx.reg_alloc.GetArgumentInfo(inst);
 if (!args[0].IsImmediate()) {
-ctx.reg_alloc.DefineValue(code, inst, args[0]);
+ctx.reg_alloc.DefineValue(inst, args[0]);
 }
 }
@@ -78,7 +78,7 @@ void EmitX64::EmitBreakpoint(EmitContext&, IR::Inst*) {
 void EmitX64::EmitCallHostFunction(EmitContext& ctx, IR::Inst* inst) {
 auto args = ctx.reg_alloc.GetArgumentInfo(inst);
-ctx.reg_alloc.HostCall(code, nullptr, args[1], args[2], args[3]);
+ctx.reg_alloc.HostCall(nullptr, args[1], args[2], args[3]);
 code.mov(rax, args[0].GetImmediateU64());
 code.call(rax);
 }
@@ -120,7 +120,7 @@ void EmitX64::EmitVerboseDebuggingOutput(RegAlloc& reg_alloc) {
 code.lea(rax, ptr[rsp + sizeof(RegisterData) + offsetof(StackLayout, spill)]);
 code.mov(qword[rsp + offsetof(RegisterData, spill)], rax);
-reg_alloc.EmitVerboseDebuggingOutput(code);
+reg_alloc.EmitVerboseDebuggingOutput();
 for (int i = 0; i < 16; i++) {
 if (rsp.getIdx() == i) {
@@ -140,9 +140,9 @@ void EmitX64::EmitPushRSB(EmitContext& ctx, IR::Inst* inst) {
 ASSERT(args[0].IsImmediate());
 const u64 unique_hash_of_target = args[0].GetImmediateU64();
-ctx.reg_alloc.ScratchGpr(code, HostLoc::RCX);
-const Xbyak::Reg64 loc_desc_reg = ctx.reg_alloc.ScratchGpr(code);
-const Xbyak::Reg64 index_reg = ctx.reg_alloc.ScratchGpr(code);
+ctx.reg_alloc.ScratchGpr(HostLoc::RCX);
+const Xbyak::Reg64 loc_desc_reg = ctx.reg_alloc.ScratchGpr();
+const Xbyak::Reg64 index_reg = ctx.reg_alloc.ScratchGpr();
 PushRSBHelper(loc_desc_reg, index_reg, IR::LocationDescriptor{unique_hash_of_target});
 }
@@ -190,12 +190,12 @@ void EmitX64::EmitGetNZFromOp(EmitContext& ctx, IR::Inst* inst) {
 }
 }();
-const Xbyak::Reg64 nz = ctx.reg_alloc.ScratchGpr(code, HostLoc::RAX);
-const Xbyak::Reg value = ctx.reg_alloc.UseGpr(code, args[0]).changeBit(bitsize);
+const Xbyak::Reg64 nz = ctx.reg_alloc.ScratchGpr(HostLoc::RAX);
+const Xbyak::Reg value = ctx.reg_alloc.UseGpr(args[0]).changeBit(bitsize);
 code.test(value, value);
 code.lahf();
 code.movzx(eax, ah);
-ctx.reg_alloc.DefineValue(code, inst, nz);
+ctx.reg_alloc.DefineValue(inst, nz);
 }
 void EmitX64::EmitGetNZCVFromOp(EmitContext& ctx, IR::Inst* inst) {
@@ -221,27 +221,27 @@ void EmitX64::EmitGetNZCVFromOp(EmitContext& ctx, IR::Inst* inst) {
 }
 }();
-const Xbyak::Reg64 nzcv = ctx.reg_alloc.ScratchGpr(code, HostLoc::RAX);
-const Xbyak::Reg value = ctx.reg_alloc.UseGpr(code, args[0]).changeBit(bitsize);
+const Xbyak::Reg64 nzcv = ctx.reg_alloc.ScratchGpr(HostLoc::RAX);
+const Xbyak::Reg value = ctx.reg_alloc.UseGpr(args[0]).changeBit(bitsize);
 code.test(value, value);
 code.lahf();
 code.xor_(al, al);
-ctx.reg_alloc.DefineValue(code, inst, nzcv);
+ctx.reg_alloc.DefineValue(inst, nzcv);
 }
 void EmitX64::EmitGetCFlagFromNZCV(EmitContext& ctx, IR::Inst* inst) {
 auto args = ctx.reg_alloc.GetArgumentInfo(inst);
 if (args[0].IsImmediate()) {
-const Xbyak::Reg32 result = ctx.reg_alloc.ScratchGpr(code).cvt32();
+const Xbyak::Reg32 result = ctx.reg_alloc.ScratchGpr().cvt32();
 const u32 value = (args[0].GetImmediateU32() >> 8) & 1;
 code.mov(result, value);
-ctx.reg_alloc.DefineValue(code, inst, result);
+ctx.reg_alloc.DefineValue(inst, result);
 } else {
-const Xbyak::Reg32 result = ctx.reg_alloc.UseScratchGpr(code, args[0]).cvt32();
+const Xbyak::Reg32 result = ctx.reg_alloc.UseScratchGpr(args[0]).cvt32();
 code.shr(result, 8);
 code.and_(result, 1);
-ctx.reg_alloc.DefineValue(code, inst, result);
+ctx.reg_alloc.DefineValue(inst, result);
 }
 }
@@ -249,30 +249,30 @@ void EmitX64::EmitNZCVFromPackedFlags(EmitContext& ctx, IR::Inst* inst) {
 auto args = ctx.reg_alloc.GetArgumentInfo(inst);
 if (args[0].IsImmediate()) {
-const Xbyak::Reg32 nzcv = ctx.reg_alloc.ScratchGpr(code).cvt32();
+const Xbyak::Reg32 nzcv = ctx.reg_alloc.ScratchGpr().cvt32();
 u32 value = 0;
 value |= mcl::bit::get_bit<31>(args[0].GetImmediateU32()) ? (1 << 15) : 0;
 value |= mcl::bit::get_bit<30>(args[0].GetImmediateU32()) ? (1 << 14) : 0;
 value |= mcl::bit::get_bit<29>(args[0].GetImmediateU32()) ? (1 << 8) : 0;
 value |= mcl::bit::get_bit<28>(args[0].GetImmediateU32()) ? (1 << 0) : 0;
code.mov(nzcv, value);
ctx.reg_alloc.DefineValue(code, inst, nzcv);
ctx.reg_alloc.DefineValue(inst, nzcv);
} else if (code.HasHostFeature(HostFeature::FastBMI2)) {
const Xbyak::Reg32 nzcv = ctx.reg_alloc.UseScratchGpr(code, args[0]).cvt32();
const Xbyak::Reg32 tmp = ctx.reg_alloc.ScratchGpr(code).cvt32();
const Xbyak::Reg32 nzcv = ctx.reg_alloc.UseScratchGpr(args[0]).cvt32();
const Xbyak::Reg32 tmp = ctx.reg_alloc.ScratchGpr().cvt32();
code.shr(nzcv, 28);
code.mov(tmp, NZCV::x64_mask);
code.pdep(nzcv, nzcv, tmp);
ctx.reg_alloc.DefineValue(code, inst, nzcv);
ctx.reg_alloc.DefineValue(inst, nzcv);
} else {
const Xbyak::Reg32 nzcv = ctx.reg_alloc.UseScratchGpr(code, args[0]).cvt32();
const Xbyak::Reg32 nzcv = ctx.reg_alloc.UseScratchGpr(args[0]).cvt32();
code.shr(nzcv, 28);
code.imul(nzcv, nzcv, NZCV::to_x64_multiplier);
code.and_(nzcv, NZCV::x64_mask);
ctx.reg_alloc.DefineValue(code, inst, nzcv);
ctx.reg_alloc.DefineValue(inst, nzcv);
}
}


@@ -23,13 +23,13 @@ using AESFn = void(AES::State&, const AES::State&);
static void EmitAESFunction(RegAlloc::ArgumentInfo args, EmitContext& ctx, BlockOfCode& code, IR::Inst* inst, AESFn fn) {
constexpr u32 stack_space = static_cast<u32>(sizeof(AES::State)) * 2;
const Xbyak::Xmm input = ctx.reg_alloc.UseXmm(code, args[0]);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm input = ctx.reg_alloc.UseXmm(args[0]);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm();
ctx.reg_alloc.EndOfAllocScope();
ctx.reg_alloc.HostCall(code, nullptr);
ctx.reg_alloc.HostCall(nullptr);
ctx.reg_alloc.AllocStackSpace(code, stack_space + ABI_SHADOW_SPACE);
ctx.reg_alloc.AllocStackSpace(stack_space + ABI_SHADOW_SPACE);
code.lea(code.ABI_PARAM1, ptr[rsp + ABI_SHADOW_SPACE]);
code.lea(code.ABI_PARAM2, ptr[rsp + ABI_SHADOW_SPACE + sizeof(AES::State)]);
@@ -37,22 +37,22 @@ static void EmitAESFunction(RegAlloc::ArgumentInfo args, EmitContext& ctx, Block
code.CallFunction(fn);
code.movaps(result, xword[rsp + ABI_SHADOW_SPACE]);
ctx.reg_alloc.ReleaseStackSpace(code, stack_space + ABI_SHADOW_SPACE);
ctx.reg_alloc.ReleaseStackSpace(stack_space + ABI_SHADOW_SPACE);
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
}
void EmitX64::EmitAESDecryptSingleRound(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
if (code.HasHostFeature(HostFeature::AES)) {
const Xbyak::Xmm data = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm zero = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm data = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm zero = ctx.reg_alloc.ScratchXmm();
code.pxor(zero, zero);
code.aesdeclast(data, zero);
ctx.reg_alloc.DefineValue(code, inst, data);
ctx.reg_alloc.DefineValue(inst, data);
return;
}
@@ -63,13 +63,13 @@ void EmitX64::EmitAESEncryptSingleRound(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
if (code.HasHostFeature(HostFeature::AES)) {
const Xbyak::Xmm data = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm zero = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm data = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm zero = ctx.reg_alloc.ScratchXmm();
code.pxor(zero, zero);
code.aesenclast(data, zero);
ctx.reg_alloc.DefineValue(code, inst, data);
ctx.reg_alloc.DefineValue(inst, data);
return;
}
@@ -80,11 +80,11 @@ void EmitX64::EmitAESInverseMixColumns(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
if (code.HasHostFeature(HostFeature::AES)) {
const Xbyak::Xmm data = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm data = ctx.reg_alloc.UseScratchXmm(args[0]);
code.aesimc(data, data);
ctx.reg_alloc.DefineValue(code, inst, data);
ctx.reg_alloc.DefineValue(inst, data);
return;
}
@@ -95,14 +95,14 @@ void EmitX64::EmitAESMixColumns(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
if (code.HasHostFeature(HostFeature::AES)) {
const Xbyak::Xmm data = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm zero = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm data = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm zero = ctx.reg_alloc.ScratchXmm();
code.pxor(zero, zero);
code.aesdeclast(data, zero);
code.aesenc(data, zero);
ctx.reg_alloc.DefineValue(code, inst, data);
ctx.reg_alloc.DefineValue(inst, data);
return;
}


@@ -1,6 +1,3 @@
// SPDX-FileCopyrightText: Copyright 2025 Eden Emulator Project
// SPDX-License-Identifier: GPL-3.0-or-later
/* This file is part of the dynarmic project.
* Copyright (c) 2018 MerryMage
* SPDX-License-Identifier: 0BSD
@@ -22,16 +19,16 @@ namespace CRC32 = Common::Crypto::CRC32;
static void EmitCRC32Castagnoli(BlockOfCode& code, EmitContext& ctx, IR::Inst* inst, const int data_size) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
if (code.HasHostFeature(HostFeature::SSE42)) {
const Xbyak::Reg32 crc = ctx.reg_alloc.UseScratchGpr(code, args[0]).cvt32();
const Xbyak::Reg value = ctx.reg_alloc.UseGpr(code, args[1]).changeBit(data_size);
const Xbyak::Reg32 crc = ctx.reg_alloc.UseScratchGpr(args[0]).cvt32();
const Xbyak::Reg value = ctx.reg_alloc.UseGpr(args[1]).changeBit(data_size);
if (data_size != 64) {
code.crc32(crc, value);
} else {
code.crc32(crc.cvt64(), value);
}
ctx.reg_alloc.DefineValue(code, inst, crc);
ctx.reg_alloc.DefineValue(inst, crc);
} else {
ctx.reg_alloc.HostCall(code, inst, args[0], args[1], {});
ctx.reg_alloc.HostCall(inst, args[0], args[1], {});
code.mov(code.ABI_PARAM3.cvt32(), data_size / CHAR_BIT); //zext
code.CallFunction(&CRC32::ComputeCRC32Castagnoli);
}
@@ -41,11 +38,11 @@ static void EmitCRC32ISO(BlockOfCode& code, EmitContext& ctx, IR::Inst* inst, co
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
if (code.HasHostFeature(HostFeature::PCLMULQDQ) && data_size < 32) {
const Xbyak::Reg32 crc = ctx.reg_alloc.UseScratchGpr(code, args[0]).cvt32();
const Xbyak::Reg64 value = ctx.reg_alloc.UseScratchGpr(code, args[1]);
const Xbyak::Xmm xmm_value = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm xmm_const = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm xmm_tmp = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Reg32 crc = ctx.reg_alloc.UseScratchGpr(args[0]).cvt32();
const Xbyak::Reg64 value = ctx.reg_alloc.UseScratchGpr(args[1]);
const Xbyak::Xmm xmm_value = ctx.reg_alloc.ScratchXmm();
const Xbyak::Xmm xmm_const = ctx.reg_alloc.ScratchXmm();
const Xbyak::Xmm xmm_tmp = ctx.reg_alloc.ScratchXmm();
code.movdqa(xmm_const, code.Const(xword, 0xb4e5b025'f7011641, 0x00000001'DB710641));
@@ -67,12 +64,12 @@ static void EmitCRC32ISO(BlockOfCode& code, EmitContext& ctx, IR::Inst* inst, co
code.pextrd(crc, xmm_value, 2);
ctx.reg_alloc.DefineValue(code, inst, crc);
ctx.reg_alloc.DefineValue(inst, crc);
} else if (code.HasHostFeature(HostFeature::PCLMULQDQ) && data_size == 32) {
const Xbyak::Reg32 crc = ctx.reg_alloc.UseScratchGpr(code, args[0]).cvt32();
const Xbyak::Reg32 value = ctx.reg_alloc.UseGpr(code, args[1]).cvt32();
const Xbyak::Xmm xmm_value = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm xmm_const = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Reg32 crc = ctx.reg_alloc.UseScratchGpr(args[0]).cvt32();
const Xbyak::Reg32 value = ctx.reg_alloc.UseGpr(args[1]).cvt32();
const Xbyak::Xmm xmm_value = ctx.reg_alloc.ScratchXmm();
const Xbyak::Xmm xmm_const = ctx.reg_alloc.ScratchXmm();
code.movdqa(xmm_const, code.Const(xword, 0xb4e5b025'f7011641, 0x00000001'DB710641));
@@ -85,12 +82,12 @@ static void EmitCRC32ISO(BlockOfCode& code, EmitContext& ctx, IR::Inst* inst, co
code.pextrd(crc, xmm_value, 2);
ctx.reg_alloc.DefineValue(code, inst, crc);
ctx.reg_alloc.DefineValue(inst, crc);
} else if (code.HasHostFeature(HostFeature::PCLMULQDQ) && data_size == 64) {
const Xbyak::Reg32 crc = ctx.reg_alloc.UseScratchGpr(code, args[0]).cvt32();
const Xbyak::Reg64 value = ctx.reg_alloc.UseGpr(code, args[1]);
const Xbyak::Xmm xmm_value = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm xmm_const = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Reg32 crc = ctx.reg_alloc.UseScratchGpr(args[0]).cvt32();
const Xbyak::Reg64 value = ctx.reg_alloc.UseGpr(args[1]);
const Xbyak::Xmm xmm_value = ctx.reg_alloc.ScratchXmm();
const Xbyak::Xmm xmm_const = ctx.reg_alloc.ScratchXmm();
code.movdqa(xmm_const, code.Const(xword, 0xb4e5b025'f7011641, 0x00000001'DB710641));
@@ -103,9 +100,9 @@ static void EmitCRC32ISO(BlockOfCode& code, EmitContext& ctx, IR::Inst* inst, co
code.pextrd(crc, xmm_value, 2);
ctx.reg_alloc.DefineValue(code, inst, crc);
ctx.reg_alloc.DefineValue(inst, crc);
} else {
ctx.reg_alloc.HostCall(code, inst, args[0], args[1], {});
ctx.reg_alloc.HostCall(inst, args[0], args[1], {});
code.mov(code.ABI_PARAM3, data_size / CHAR_BIT);
code.CallFunction(&CRC32::ComputeCRC32ISO);
}


@@ -54,14 +54,14 @@ void AxxEmitX64::EmitMemoryRead(AxxEmitContext& ctx, IR::Inst* inst) {
if (!conf.page_table && !fastmem_marker) {
// Neither fastmem nor page table: Use callbacks
if constexpr (bitsize == 128) {
ctx.reg_alloc.HostCall(code, nullptr, {}, args[1]);
ctx.reg_alloc.HostCall(nullptr, {}, args[1]);
if (ordered) {
code.mfence();
}
code.CallFunction(memory_read_128);
ctx.reg_alloc.DefineValue(code, inst, xmm1);
ctx.reg_alloc.DefineValue(inst, xmm1);
} else {
ctx.reg_alloc.HostCall(code, inst, {}, args[1]);
ctx.reg_alloc.HostCall(inst, {}, args[1]);
if (ordered) {
code.mfence();
}
@@ -74,14 +74,14 @@ void AxxEmitX64::EmitMemoryRead(AxxEmitContext& ctx, IR::Inst* inst) {
if (ordered && bitsize == 128) {
// Required for atomic 128-bit loads/stores
ctx.reg_alloc.ScratchGpr(code, HostLoc::RAX);
ctx.reg_alloc.ScratchGpr(code, HostLoc::RBX);
ctx.reg_alloc.ScratchGpr(code, HostLoc::RCX);
ctx.reg_alloc.ScratchGpr(code, HostLoc::RDX);
ctx.reg_alloc.ScratchGpr(HostLoc::RAX);
ctx.reg_alloc.ScratchGpr(HostLoc::RBX);
ctx.reg_alloc.ScratchGpr(HostLoc::RCX);
ctx.reg_alloc.ScratchGpr(HostLoc::RDX);
}
const Xbyak::Reg64 vaddr = ctx.reg_alloc.UseGpr(code, args[1]);
const int value_idx = bitsize == 128 ? ctx.reg_alloc.ScratchXmm(code).getIdx() : ctx.reg_alloc.ScratchGpr(code).getIdx();
const Xbyak::Reg64 vaddr = ctx.reg_alloc.UseGpr(args[1]);
const int value_idx = bitsize == 128 ? ctx.reg_alloc.ScratchXmm().getIdx() : ctx.reg_alloc.ScratchGpr().getIdx();
const auto wrapped_fn = read_fallbacks[std::make_tuple(ordered, bitsize, vaddr.getIdx(), value_idx)];
@@ -126,9 +126,9 @@ void AxxEmitX64::EmitMemoryRead(AxxEmitContext& ctx, IR::Inst* inst) {
code.L(*end);
if constexpr (bitsize == 128) {
ctx.reg_alloc.DefineValue(code, inst, Xbyak::Xmm{value_idx});
ctx.reg_alloc.DefineValue(inst, Xbyak::Xmm{value_idx});
} else {
ctx.reg_alloc.DefineValue(code, inst, Xbyak::Reg64{value_idx});
ctx.reg_alloc.DefineValue(inst, Xbyak::Reg64{value_idx});
}
}
@@ -141,13 +141,13 @@ void AxxEmitX64::EmitMemoryWrite(AxxEmitContext& ctx, IR::Inst* inst) {
if (!conf.page_table && !fastmem_marker) {
// Neither fastmem nor page table: Use callbacks
if constexpr (bitsize == 128) {
ctx.reg_alloc.Use(code, args[1], ABI_PARAM2);
ctx.reg_alloc.Use(code, args[2], HostLoc::XMM1);
ctx.reg_alloc.Use(args[1], ABI_PARAM2);
ctx.reg_alloc.Use(args[2], HostLoc::XMM1);
ctx.reg_alloc.EndOfAllocScope();
ctx.reg_alloc.HostCall(code, nullptr);
ctx.reg_alloc.HostCall(nullptr);
code.CallFunction(memory_write_128);
} else {
ctx.reg_alloc.HostCall(code, nullptr, {}, args[1], args[2]);
ctx.reg_alloc.HostCall(nullptr, {}, args[1], args[2]);
Devirtualize<callback>(conf.callbacks).EmitCall(code);
}
if (ordered) {
@@ -159,16 +159,16 @@ void AxxEmitX64::EmitMemoryWrite(AxxEmitContext& ctx, IR::Inst* inst) {
if (ordered && bitsize == 128) {
// Required for atomic 128-bit loads/stores
ctx.reg_alloc.ScratchGpr(code, HostLoc::RAX);
ctx.reg_alloc.ScratchGpr(code, HostLoc::RBX);
ctx.reg_alloc.ScratchGpr(code, HostLoc::RCX);
ctx.reg_alloc.ScratchGpr(code, HostLoc::RDX);
ctx.reg_alloc.ScratchGpr(HostLoc::RAX);
ctx.reg_alloc.ScratchGpr(HostLoc::RBX);
ctx.reg_alloc.ScratchGpr(HostLoc::RCX);
ctx.reg_alloc.ScratchGpr(HostLoc::RDX);
}
const Xbyak::Reg64 vaddr = ctx.reg_alloc.UseGpr(code, args[1]);
const Xbyak::Reg64 vaddr = ctx.reg_alloc.UseGpr(args[1]);
const int value_idx = bitsize == 128
? ctx.reg_alloc.UseXmm(code, args[2]).getIdx()
: (ordered ? ctx.reg_alloc.UseScratchGpr(code, args[2]).getIdx() : ctx.reg_alloc.UseGpr(code, args[2]).getIdx());
? ctx.reg_alloc.UseXmm(args[2]).getIdx()
: (ordered ? ctx.reg_alloc.UseScratchGpr(args[2]).getIdx() : ctx.reg_alloc.UseGpr(args[2]).getIdx());
const auto wrapped_fn = write_fallbacks[std::make_tuple(ordered, bitsize, vaddr.getIdx(), value_idx)];
@@ -222,7 +222,7 @@ void AxxEmitX64::EmitExclusiveReadMemory(AxxEmitContext& ctx, IR::Inst* inst) {
if constexpr (bitsize != 128) {
using T = mcl::unsigned_integer_of_size<bitsize>;
ctx.reg_alloc.HostCall(code, inst, {}, args[1]);
ctx.reg_alloc.HostCall(inst, {}, args[1]);
code.mov(code.byte[code.ABI_JIT_PTR + offsetof(AxxJitState, exclusive_state)], u8(1));
code.mov(code.ABI_PARAM1, reinterpret_cast<u64>(&conf));
@@ -237,14 +237,14 @@ void AxxEmitX64::EmitExclusiveReadMemory(AxxEmitContext& ctx, IR::Inst* inst) {
});
code.ZeroExtendFrom(bitsize, code.ABI_RETURN);
} else {
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm(code);
ctx.reg_alloc.Use(code, args[1], ABI_PARAM2);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm();
ctx.reg_alloc.Use(args[1], ABI_PARAM2);
ctx.reg_alloc.EndOfAllocScope();
ctx.reg_alloc.HostCall(code, nullptr);
ctx.reg_alloc.HostCall(nullptr);
code.mov(code.byte[code.ABI_JIT_PTR + offsetof(AxxJitState, exclusive_state)], u8(1));
code.mov(code.ABI_PARAM1, reinterpret_cast<u64>(&conf));
ctx.reg_alloc.AllocStackSpace(code, 16 + ABI_SHADOW_SPACE);
ctx.reg_alloc.AllocStackSpace(16 + ABI_SHADOW_SPACE);
code.lea(code.ABI_PARAM3, ptr[rsp + ABI_SHADOW_SPACE]);
if (ordered) {
code.mfence();
@@ -256,9 +256,9 @@ void AxxEmitX64::EmitExclusiveReadMemory(AxxEmitContext& ctx, IR::Inst* inst) {
});
});
code.movups(result, xword[rsp + ABI_SHADOW_SPACE]);
ctx.reg_alloc.ReleaseStackSpace(code, 16 + ABI_SHADOW_SPACE);
ctx.reg_alloc.ReleaseStackSpace(16 + ABI_SHADOW_SPACE);
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
}
EmitCheckMemoryAbort(ctx, inst);
@@ -271,15 +271,15 @@ void AxxEmitX64::EmitExclusiveWriteMemory(AxxEmitContext& ctx, IR::Inst* inst) {
const bool ordered = IsOrdered(args[3].GetImmediateAccType());
if constexpr (bitsize == 128) {
ctx.reg_alloc.Use(code, args[1], ABI_PARAM2);
ctx.reg_alloc.Use(code, args[2], HostLoc::XMM1);
ctx.reg_alloc.Use(args[1], ABI_PARAM2);
ctx.reg_alloc.Use(args[2], HostLoc::XMM1);
ctx.reg_alloc.EndOfAllocScope();
ctx.reg_alloc.HostCall(code, inst);
ctx.reg_alloc.HostCall(inst);
} else {
ctx.reg_alloc.HostCall(code, inst, {}, args[1], args[2]);
ctx.reg_alloc.HostCall(inst, {}, args[1], args[2]);
}
const Xbyak::Reg64 tmp = ctx.reg_alloc.ScratchGpr(code);
const Xbyak::Reg64 tmp = ctx.reg_alloc.ScratchGpr();
Xbyak::Label end;
code.mov(code.ABI_RETURN, u32(1));
code.movzx(tmp.cvt32(), code.byte[code.ABI_JIT_PTR + offsetof(AxxJitState, exclusive_state)]);
@@ -299,7 +299,7 @@ void AxxEmitX64::EmitExclusiveWriteMemory(AxxEmitContext& ctx, IR::Inst* inst) {
code.mfence();
}
} else {
ctx.reg_alloc.AllocStackSpace(code, 16 + ABI_SHADOW_SPACE);
ctx.reg_alloc.AllocStackSpace(16 + ABI_SHADOW_SPACE);
code.lea(code.ABI_PARAM3, ptr[rsp + ABI_SHADOW_SPACE]);
code.movaps(xword[code.ABI_PARAM3], xmm1);
code.CallLambda([](AxxUserConfig& conf, Axx::VAddr vaddr, Vector& value) -> u32 {
@@ -310,7 +310,7 @@ void AxxEmitX64::EmitExclusiveWriteMemory(AxxEmitContext& ctx, IR::Inst* inst) {
if (ordered) {
code.mfence();
}
ctx.reg_alloc.ReleaseStackSpace(code, 16 + ABI_SHADOW_SPACE);
ctx.reg_alloc.ReleaseStackSpace(16 + ABI_SHADOW_SPACE);
}
code.L(end);
@@ -330,16 +330,16 @@ void AxxEmitX64::EmitExclusiveReadMemoryInline(AxxEmitContext& ctx, IR::Inst* in
if constexpr (ordered && bitsize == 128) {
// Required for atomic 128-bit loads/stores
ctx.reg_alloc.ScratchGpr(code, HostLoc::RAX);
ctx.reg_alloc.ScratchGpr(code, HostLoc::RBX);
ctx.reg_alloc.ScratchGpr(code, HostLoc::RCX);
ctx.reg_alloc.ScratchGpr(code, HostLoc::RDX);
ctx.reg_alloc.ScratchGpr(HostLoc::RAX);
ctx.reg_alloc.ScratchGpr(HostLoc::RBX);
ctx.reg_alloc.ScratchGpr(HostLoc::RCX);
ctx.reg_alloc.ScratchGpr(HostLoc::RDX);
}
const Xbyak::Reg64 vaddr = ctx.reg_alloc.UseGpr(code, args[1]);
const int value_idx = bitsize == 128 ? ctx.reg_alloc.ScratchXmm(code).getIdx() : ctx.reg_alloc.ScratchGpr(code).getIdx();
const Xbyak::Reg64 tmp = ctx.reg_alloc.ScratchGpr(code);
const Xbyak::Reg64 tmp2 = ctx.reg_alloc.ScratchGpr(code);
const Xbyak::Reg64 vaddr = ctx.reg_alloc.UseGpr(args[1]);
const int value_idx = bitsize == 128 ? ctx.reg_alloc.ScratchXmm().getIdx() : ctx.reg_alloc.ScratchGpr().getIdx();
const Xbyak::Reg64 tmp = ctx.reg_alloc.ScratchGpr();
const Xbyak::Reg64 tmp2 = ctx.reg_alloc.ScratchGpr();
const auto wrapped_fn = read_fallbacks[std::make_tuple(ordered, bitsize, vaddr.getIdx(), value_idx)];
@@ -386,9 +386,9 @@ void AxxEmitX64::EmitExclusiveReadMemoryInline(AxxEmitContext& ctx, IR::Inst* in
EmitExclusiveUnlock(code, conf, tmp, tmp2.cvt32());
if constexpr (bitsize == 128) {
ctx.reg_alloc.DefineValue(code, inst, Xbyak::Xmm{value_idx});
ctx.reg_alloc.DefineValue(inst, Xbyak::Xmm{value_idx});
} else {
ctx.reg_alloc.DefineValue(code, inst, Xbyak::Reg64{value_idx});
ctx.reg_alloc.DefineValue(inst, Xbyak::Reg64{value_idx});
}
EmitCheckMemoryAbort(ctx, inst);
@@ -407,19 +407,19 @@ void AxxEmitX64::EmitExclusiveWriteMemoryInline(AxxEmitContext& ctx, IR::Inst* i
const auto value = [&] {
if constexpr (bitsize == 128) {
ctx.reg_alloc.ScratchGpr(code, HostLoc::RAX);
ctx.reg_alloc.ScratchGpr(code, HostLoc::RBX);
ctx.reg_alloc.ScratchGpr(code, HostLoc::RCX);
ctx.reg_alloc.ScratchGpr(code, HostLoc::RDX);
return ctx.reg_alloc.UseXmm(code, args[2]);
ctx.reg_alloc.ScratchGpr(HostLoc::RAX);
ctx.reg_alloc.ScratchGpr(HostLoc::RBX);
ctx.reg_alloc.ScratchGpr(HostLoc::RCX);
ctx.reg_alloc.ScratchGpr(HostLoc::RDX);
return ctx.reg_alloc.UseXmm(args[2]);
} else {
ctx.reg_alloc.ScratchGpr(code, HostLoc::RAX);
return ctx.reg_alloc.UseGpr(code, args[2]);
ctx.reg_alloc.ScratchGpr(HostLoc::RAX);
return ctx.reg_alloc.UseGpr(args[2]);
}
}();
const Xbyak::Reg64 vaddr = ctx.reg_alloc.UseGpr(code, args[1]);
const Xbyak::Reg32 status = ctx.reg_alloc.ScratchGpr(code).cvt32();
const Xbyak::Reg64 tmp = ctx.reg_alloc.ScratchGpr(code);
const Xbyak::Reg64 vaddr = ctx.reg_alloc.UseGpr(args[1]);
const Xbyak::Reg32 status = ctx.reg_alloc.ScratchGpr().cvt32();
const Xbyak::Reg64 tmp = ctx.reg_alloc.ScratchGpr();
const auto wrapped_fn = exclusive_write_fallbacks[std::make_tuple(ordered, bitsize, vaddr.getIdx(), value.getIdx())];
@@ -518,7 +518,7 @@ void AxxEmitX64::EmitExclusiveWriteMemoryInline(AxxEmitContext& ctx, IR::Inst* i
code.L(*end);
EmitExclusiveUnlock(code, conf, tmp, eax);
ctx.reg_alloc.DefineValue(code, inst, status);
ctx.reg_alloc.DefineValue(inst, status);
EmitCheckMemoryAbort(ctx, inst);
}


@@ -75,8 +75,8 @@ Xbyak::RegExp EmitVAddrLookup(BlockOfCode& code, EmitContext& ctx, size_t bitsiz
template<>
[[maybe_unused]] Xbyak::RegExp EmitVAddrLookup<A32EmitContext>(BlockOfCode& code, A32EmitContext& ctx, size_t bitsize, Xbyak::Label& abort, Xbyak::Reg64 vaddr) {
const Xbyak::Reg64 page = ctx.reg_alloc.ScratchGpr(code);
const Xbyak::Reg32 tmp = ctx.conf.absolute_offset_page_table ? page.cvt32() : ctx.reg_alloc.ScratchGpr(code).cvt32();
const Xbyak::Reg64 page = ctx.reg_alloc.ScratchGpr();
const Xbyak::Reg32 tmp = ctx.conf.absolute_offset_page_table ? page.cvt32() : ctx.reg_alloc.ScratchGpr().cvt32();
EmitDetectMisalignedVAddr(code, ctx, bitsize, abort, vaddr, tmp.cvt64());
@@ -105,8 +105,8 @@ template<>
const size_t valid_page_index_bits = ctx.conf.page_table_address_space_bits - page_bits;
const size_t unused_top_bits = 64 - ctx.conf.page_table_address_space_bits;
const Xbyak::Reg64 page = ctx.reg_alloc.ScratchGpr(code);
const Xbyak::Reg64 tmp = ctx.conf.absolute_offset_page_table ? page : ctx.reg_alloc.ScratchGpr(code);
const Xbyak::Reg64 page = ctx.reg_alloc.ScratchGpr();
const Xbyak::Reg64 tmp = ctx.conf.absolute_offset_page_table ? page : ctx.reg_alloc.ScratchGpr();
EmitDetectMisalignedVAddr(code, ctx, bitsize, abort, vaddr, tmp);
@@ -116,7 +116,7 @@ template<>
} else if (ctx.conf.silently_mirror_page_table) {
if (valid_page_index_bits >= 32) {
if (code.HasHostFeature(HostFeature::BMI2)) {
const Xbyak::Reg64 bit_count = ctx.reg_alloc.ScratchGpr(code);
const Xbyak::Reg64 bit_count = ctx.reg_alloc.ScratchGpr();
code.mov(bit_count, unused_top_bits);
code.bzhi(tmp, vaddr, bit_count);
code.shr(tmp, int(page_bits));
@@ -168,7 +168,7 @@ template<>
return r13 + vaddr;
} else if (ctx.conf.silently_mirror_fastmem) {
if (!tmp) {
tmp = ctx.reg_alloc.ScratchGpr(code);
tmp = ctx.reg_alloc.ScratchGpr();
}
if (unused_top_bits < 32) {
code.mov(*tmp, vaddr);
@@ -189,7 +189,7 @@ template<>
} else {
// TODO: Consider having TEST as above but coalesce 64-bit constant in register allocator
if (!tmp) {
tmp = ctx.reg_alloc.ScratchGpr(code);
tmp = ctx.reg_alloc.ScratchGpr();
}
code.mov(*tmp, vaddr);
code.shr(*tmp, int(ctx.conf.fastmem_address_space_bits));


@@ -1,6 +1,3 @@
// SPDX-FileCopyrightText: Copyright 2025 Eden Emulator Project
// SPDX-License-Identifier: GPL-3.0-or-later
/* This file is part of the dynarmic project.
* Copyright (c) 2016 MerryMage
* SPDX-License-Identifier: 0BSD
@@ -19,14 +16,14 @@ void EmitX64::EmitPackedAddU8(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const auto ge_inst = inst->GetAssociatedPseudoOperation(IR::Opcode::GetGEFromOp);
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(args[1]);
code.paddb(xmm_a, xmm_b);
if (ge_inst) {
const Xbyak::Xmm xmm_ge = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm ones = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm xmm_ge = ctx.reg_alloc.ScratchXmm();
const Xbyak::Xmm ones = ctx.reg_alloc.ScratchXmm();
code.pcmpeqb(ones, ones);
@@ -35,21 +32,21 @@ void EmitX64::EmitPackedAddU8(EmitContext& ctx, IR::Inst* inst) {
code.pcmpeqb(xmm_ge, xmm_b);
code.pxor(xmm_ge, ones);
ctx.reg_alloc.DefineValue(code, ge_inst, xmm_ge);
ctx.reg_alloc.DefineValue(ge_inst, xmm_ge);
}
ctx.reg_alloc.DefineValue(code, inst, xmm_a);
ctx.reg_alloc.DefineValue(inst, xmm_a);
}
void EmitX64::EmitPackedAddS8(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const auto ge_inst = inst->GetAssociatedPseudoOperation(IR::Opcode::GetGEFromOp);
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(args[1]);
if (ge_inst) {
const Xbyak::Xmm xmm_ge = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm xmm_ge = ctx.reg_alloc.ScratchXmm();
code.pcmpeqb(xmm0, xmm0);
@@ -57,27 +54,27 @@ void EmitX64::EmitPackedAddS8(EmitContext& ctx, IR::Inst* inst) {
code.paddsb(xmm_ge, xmm_b);
code.pcmpgtb(xmm_ge, xmm0);
ctx.reg_alloc.DefineValue(code, ge_inst, xmm_ge);
ctx.reg_alloc.DefineValue(ge_inst, xmm_ge);
}
code.paddb(xmm_a, xmm_b);
ctx.reg_alloc.DefineValue(code, inst, xmm_a);
ctx.reg_alloc.DefineValue(inst, xmm_a);
}
void EmitX64::EmitPackedAddU16(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const auto ge_inst = inst->GetAssociatedPseudoOperation(IR::Opcode::GetGEFromOp);
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(args[1]);
code.paddw(xmm_a, xmm_b);
if (ge_inst) {
if (code.HasHostFeature(HostFeature::SSE41)) {
const Xbyak::Xmm xmm_ge = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm ones = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm xmm_ge = ctx.reg_alloc.ScratchXmm();
const Xbyak::Xmm ones = ctx.reg_alloc.ScratchXmm();
code.pcmpeqb(ones, ones);
@@ -86,10 +83,10 @@ void EmitX64::EmitPackedAddU16(EmitContext& ctx, IR::Inst* inst) {
code.pcmpeqw(xmm_ge, xmm_b);
code.pxor(xmm_ge, ones);
ctx.reg_alloc.DefineValue(code, ge_inst, xmm_ge);
ctx.reg_alloc.DefineValue(ge_inst, xmm_ge);
} else {
const Xbyak::Xmm tmp_a = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm tmp_b = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm tmp_a = ctx.reg_alloc.ScratchXmm();
const Xbyak::Xmm tmp_b = ctx.reg_alloc.ScratchXmm();
// !(b <= a+b) == b > a+b
code.movdqa(tmp_a, xmm_a);
@@ -98,22 +95,22 @@ void EmitX64::EmitPackedAddU16(EmitContext& ctx, IR::Inst* inst) {
code.paddw(tmp_b, code.Const(xword, 0x80008000));
code.pcmpgtw(tmp_b, tmp_a); // *Signed* comparison!
ctx.reg_alloc.DefineValue(code, ge_inst, tmp_b);
ctx.reg_alloc.DefineValue(ge_inst, tmp_b);
}
}
ctx.reg_alloc.DefineValue(code, inst, xmm_a);
ctx.reg_alloc.DefineValue(inst, xmm_a);
}
void EmitX64::EmitPackedAddS16(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const auto ge_inst = inst->GetAssociatedPseudoOperation(IR::Opcode::GetGEFromOp);
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(args[1]);
if (ge_inst) {
const Xbyak::Xmm xmm_ge = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm xmm_ge = ctx.reg_alloc.ScratchXmm();
code.pcmpeqw(xmm0, xmm0);
@@ -121,45 +118,45 @@ void EmitX64::EmitPackedAddS16(EmitContext& ctx, IR::Inst* inst) {
code.paddsw(xmm_ge, xmm_b);
code.pcmpgtw(xmm_ge, xmm0);
ctx.reg_alloc.DefineValue(code, ge_inst, xmm_ge);
ctx.reg_alloc.DefineValue(ge_inst, xmm_ge);
}
code.paddw(xmm_a, xmm_b);
ctx.reg_alloc.DefineValue(code, inst, xmm_a);
ctx.reg_alloc.DefineValue(inst, xmm_a);
}
void EmitX64::EmitPackedSubU8(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const auto ge_inst = inst->GetAssociatedPseudoOperation(IR::Opcode::GetGEFromOp);
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(args[1]);
if (ge_inst) {
const Xbyak::Xmm xmm_ge = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm xmm_ge = ctx.reg_alloc.ScratchXmm();
code.movdqa(xmm_ge, xmm_a);
code.pmaxub(xmm_ge, xmm_b);
code.pcmpeqb(xmm_ge, xmm_a);
ctx.reg_alloc.DefineValue(code, ge_inst, xmm_ge);
ctx.reg_alloc.DefineValue(ge_inst, xmm_ge);
}
code.psubb(xmm_a, xmm_b);
ctx.reg_alloc.DefineValue(code, inst, xmm_a);
ctx.reg_alloc.DefineValue(inst, xmm_a);
}
void EmitX64::EmitPackedSubS8(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const auto ge_inst = inst->GetAssociatedPseudoOperation(IR::Opcode::GetGEFromOp);
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(args[1]);
if (ge_inst) {
const Xbyak::Xmm xmm_ge = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm xmm_ge = ctx.reg_alloc.ScratchXmm();
code.pcmpeqb(xmm0, xmm0);
@@ -167,12 +164,12 @@ void EmitX64::EmitPackedSubS8(EmitContext& ctx, IR::Inst* inst) {
code.psubsb(xmm_ge, xmm_b);
code.pcmpgtb(xmm_ge, xmm0);
ctx.reg_alloc.DefineValue(code, ge_inst, xmm_ge);
ctx.reg_alloc.DefineValue(ge_inst, xmm_ge);
}
code.psubb(xmm_a, xmm_b);
ctx.reg_alloc.DefineValue(code, inst, xmm_a);
ctx.reg_alloc.DefineValue(inst, xmm_a);
}
void EmitX64::EmitPackedSubU16(EmitContext& ctx, IR::Inst* inst) {
@@ -180,19 +177,19 @@ void EmitX64::EmitPackedSubU16(EmitContext& ctx, IR::Inst* inst) {
const auto ge_inst = inst->GetAssociatedPseudoOperation(IR::Opcode::GetGEFromOp);
if (!ge_inst) {
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(args[1]);
code.psubw(xmm_a, xmm_b);
ctx.reg_alloc.DefineValue(code, inst, xmm_a);
ctx.reg_alloc.DefineValue(inst, xmm_a);
return;
}
if (code.HasHostFeature(HostFeature::SSE41)) {
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm xmm_ge = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(args[1]);
const Xbyak::Xmm xmm_ge = ctx.reg_alloc.ScratchXmm();
code.movdqa(xmm_ge, xmm_a);
code.pmaxuw(xmm_ge, xmm_b); // Requires SSE 4.1
@@ -200,15 +197,15 @@ void EmitX64::EmitPackedSubU16(EmitContext& ctx, IR::Inst* inst) {
code.psubw(xmm_a, xmm_b);
ctx.reg_alloc.DefineValue(code, ge_inst, xmm_ge);
ctx.reg_alloc.DefineValue(code, inst, xmm_a);
ctx.reg_alloc.DefineValue(ge_inst, xmm_ge);
ctx.reg_alloc.DefineValue(inst, xmm_a);
return;
}
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseScratchXmm(code, args[1]);
const Xbyak::Xmm xmm_ge = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm ones = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseScratchXmm(args[1]);
const Xbyak::Xmm xmm_ge = ctx.reg_alloc.ScratchXmm();
const Xbyak::Xmm ones = ctx.reg_alloc.ScratchXmm();
// (a >= b) == !(b > a)
code.pcmpeqb(ones, ones);
@@ -220,19 +217,19 @@ void EmitX64::EmitPackedSubU16(EmitContext& ctx, IR::Inst* inst) {
code.psubw(xmm_a, xmm_b);
ctx.reg_alloc.DefineValue(code, ge_inst, xmm_ge);
ctx.reg_alloc.DefineValue(code, inst, xmm_a);
ctx.reg_alloc.DefineValue(ge_inst, xmm_ge);
ctx.reg_alloc.DefineValue(inst, xmm_a);
}
void EmitX64::EmitPackedSubS16(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const auto ge_inst = inst->GetAssociatedPseudoOperation(IR::Opcode::GetGEFromOp);
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(args[1]);
if (ge_inst) {
const Xbyak::Xmm xmm_ge = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm xmm_ge = ctx.reg_alloc.ScratchXmm();
code.pcmpeqw(xmm0, xmm0);
@@ -240,21 +237,21 @@ void EmitX64::EmitPackedSubS16(EmitContext& ctx, IR::Inst* inst) {
code.psubsw(xmm_ge, xmm_b);
code.pcmpgtw(xmm_ge, xmm0);
ctx.reg_alloc.DefineValue(code, ge_inst, xmm_ge);
ctx.reg_alloc.DefineValue(ge_inst, xmm_ge);
}
code.psubw(xmm_a, xmm_b);
ctx.reg_alloc.DefineValue(code, inst, xmm_a);
ctx.reg_alloc.DefineValue(inst, xmm_a);
}
void EmitX64::EmitPackedHalvingAddU8(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
if (args[0].IsInXmm(ctx.reg_alloc) || args[1].IsInXmm(ctx.reg_alloc)) {
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseScratchXmm(code, args[1]);
const Xbyak::Xmm ones = ctx.reg_alloc.ScratchXmm(code);
if (args[0].IsInXmm() || args[1].IsInXmm()) {
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseScratchXmm(args[1]);
const Xbyak::Xmm ones = ctx.reg_alloc.ScratchXmm();
// Since,
// pavg(a, b) == (a + b + 1) >> 1
@@ -267,11 +264,11 @@ void EmitX64::EmitPackedHalvingAddU8(EmitContext& ctx, IR::Inst* inst) {
code.pavgb(xmm_a, xmm_b);
code.pxor(xmm_a, ones);
ctx.reg_alloc.DefineValue(code, inst, xmm_a);
ctx.reg_alloc.DefineValue(inst, xmm_a);
} else {
const Xbyak::Reg32 reg_a = ctx.reg_alloc.UseScratchGpr(code, args[0]).cvt32();
const Xbyak::Reg32 reg_b = ctx.reg_alloc.UseGpr(code, args[1]).cvt32();
const Xbyak::Reg32 xor_a_b = ctx.reg_alloc.ScratchGpr(code).cvt32();
const Xbyak::Reg32 reg_a = ctx.reg_alloc.UseScratchGpr(args[0]).cvt32();
const Xbyak::Reg32 reg_b = ctx.reg_alloc.UseGpr(args[1]).cvt32();
const Xbyak::Reg32 xor_a_b = ctx.reg_alloc.ScratchGpr().cvt32();
const Xbyak::Reg32 and_a_b = reg_a;
const Xbyak::Reg32 result = reg_a;
@@ -287,17 +284,17 @@ void EmitX64::EmitPackedHalvingAddU8(EmitContext& ctx, IR::Inst* inst) {
code.and_(xor_a_b, 0x7F7F7F7F);
code.add(result, xor_a_b);
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
}
}
void EmitX64::EmitPackedHalvingAddU16(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
if (args[0].IsInXmm(ctx.reg_alloc) || args[1].IsInXmm(ctx.reg_alloc)) {
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm tmp = ctx.reg_alloc.ScratchXmm(code);
if (args[0].IsInXmm() || args[1].IsInXmm()) {
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(args[1]);
const Xbyak::Xmm tmp = ctx.reg_alloc.ScratchXmm();
code.movdqa(tmp, xmm_a);
code.pand(xmm_a, xmm_b);
@@ -305,11 +302,11 @@ void EmitX64::EmitPackedHalvingAddU16(EmitContext& ctx, IR::Inst* inst) {
code.psrlw(tmp, 1);
code.paddw(xmm_a, tmp);
ctx.reg_alloc.DefineValue(code, inst, xmm_a);
ctx.reg_alloc.DefineValue(inst, xmm_a);
} else {
const Xbyak::Reg32 reg_a = ctx.reg_alloc.UseScratchGpr(code, args[0]).cvt32();
const Xbyak::Reg32 reg_b = ctx.reg_alloc.UseGpr(code, args[1]).cvt32();
const Xbyak::Reg32 xor_a_b = ctx.reg_alloc.ScratchGpr(code).cvt32();
const Xbyak::Reg32 reg_a = ctx.reg_alloc.UseScratchGpr(args[0]).cvt32();
const Xbyak::Reg32 reg_b = ctx.reg_alloc.UseGpr(args[1]).cvt32();
const Xbyak::Reg32 xor_a_b = ctx.reg_alloc.ScratchGpr().cvt32();
const Xbyak::Reg32 and_a_b = reg_a;
const Xbyak::Reg32 result = reg_a;
@@ -325,19 +322,19 @@ void EmitX64::EmitPackedHalvingAddU16(EmitContext& ctx, IR::Inst* inst) {
code.and_(xor_a_b, 0x7FFF7FFF);
code.add(result, xor_a_b);
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
}
}
void EmitX64::EmitPackedHalvingAddS8(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const Xbyak::Reg32 reg_a = ctx.reg_alloc.UseScratchGpr(code, args[0]).cvt32();
const Xbyak::Reg32 reg_b = ctx.reg_alloc.UseGpr(code, args[1]).cvt32();
const Xbyak::Reg32 xor_a_b = ctx.reg_alloc.ScratchGpr(code).cvt32();
const Xbyak::Reg32 reg_a = ctx.reg_alloc.UseScratchGpr(args[0]).cvt32();
const Xbyak::Reg32 reg_b = ctx.reg_alloc.UseGpr(args[1]).cvt32();
const Xbyak::Reg32 xor_a_b = ctx.reg_alloc.ScratchGpr().cvt32();
const Xbyak::Reg32 and_a_b = reg_a;
const Xbyak::Reg32 result = reg_a;
const Xbyak::Reg32 carry = ctx.reg_alloc.ScratchGpr(code).cvt32();
const Xbyak::Reg32 carry = ctx.reg_alloc.ScratchGpr().cvt32();
// This relies on the equality x+y == ((x&y) << 1) + (x^y).
// Note that x^y always contains the LSB of the result.
@@ -355,15 +352,15 @@ void EmitX64::EmitPackedHalvingAddS8(EmitContext& ctx, IR::Inst* inst) {
code.add(result, xor_a_b);
code.xor_(result, carry);
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
}
void EmitX64::EmitPackedHalvingAddS16(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm tmp = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(args[1]);
const Xbyak::Xmm tmp = ctx.reg_alloc.ScratchXmm();
// This relies on the equality x+y == ((x&y) << 1) + (x^y).
// Note that x^y always contains the LSB of the result.
@@ -376,14 +373,14 @@ void EmitX64::EmitPackedHalvingAddS16(EmitContext& ctx, IR::Inst* inst) {
code.psraw(tmp, 1);
code.paddw(xmm_a, tmp);
ctx.reg_alloc.DefineValue(code, inst, xmm_a);
ctx.reg_alloc.DefineValue(inst, xmm_a);
}
void EmitX64::EmitPackedHalvingSubU8(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const Xbyak::Reg32 minuend = ctx.reg_alloc.UseScratchGpr(code, args[0]).cvt32();
const Xbyak::Reg32 subtrahend = ctx.reg_alloc.UseScratchGpr(code, args[1]).cvt32();
const Xbyak::Reg32 minuend = ctx.reg_alloc.UseScratchGpr(args[0]).cvt32();
const Xbyak::Reg32 subtrahend = ctx.reg_alloc.UseScratchGpr(args[1]).cvt32();
// This relies on the equality x-y == (x^y) - (((x^y)&y) << 1).
// Note that x^y always contains the LSB of the result.
@@ -406,16 +403,16 @@ void EmitX64::EmitPackedHalvingSubU8(EmitContext& ctx, IR::Inst* inst) {
code.xor_(minuend, 0x80808080);
// minuend now contains the desired result.
ctx.reg_alloc.DefineValue(code, inst, minuend);
ctx.reg_alloc.DefineValue(inst, minuend);
}
void EmitX64::EmitPackedHalvingSubS8(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const Xbyak::Reg32 minuend = ctx.reg_alloc.UseScratchGpr(code, args[0]).cvt32();
const Xbyak::Reg32 subtrahend = ctx.reg_alloc.UseScratchGpr(code, args[1]).cvt32();
const Xbyak::Reg32 minuend = ctx.reg_alloc.UseScratchGpr(args[0]).cvt32();
const Xbyak::Reg32 subtrahend = ctx.reg_alloc.UseScratchGpr(args[1]).cvt32();
const Xbyak::Reg32 carry = ctx.reg_alloc.ScratchGpr(code).cvt32();
const Xbyak::Reg32 carry = ctx.reg_alloc.ScratchGpr().cvt32();
// This relies on the equality x-y == (x^y) - (((x^y)&y) << 1).
// Note that x^y always contains the LSB of the result.
@@ -442,14 +439,14 @@ void EmitX64::EmitPackedHalvingSubS8(EmitContext& ctx, IR::Inst* inst) {
code.xor_(minuend, 0x80808080);
code.xor_(minuend, carry);
ctx.reg_alloc.DefineValue(code, inst, minuend);
ctx.reg_alloc.DefineValue(inst, minuend);
}
void EmitX64::EmitPackedHalvingSubU16(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const Xbyak::Xmm minuend = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm subtrahend = ctx.reg_alloc.UseScratchXmm(code, args[1]);
const Xbyak::Xmm minuend = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm subtrahend = ctx.reg_alloc.UseScratchXmm(args[1]);
// This relies on the equality x-y == (x^y) - (((x^y)&y) << 1).
// Note that x^y always contains the LSB of the result.
@@ -465,14 +462,14 @@ void EmitX64::EmitPackedHalvingSubU16(EmitContext& ctx, IR::Inst* inst) {
code.psubw(minuend, subtrahend);
ctx.reg_alloc.DefineValue(code, inst, minuend);
ctx.reg_alloc.DefineValue(inst, minuend);
}
void EmitX64::EmitPackedHalvingSubS16(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const Xbyak::Xmm minuend = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm subtrahend = ctx.reg_alloc.UseScratchXmm(code, args[1]);
const Xbyak::Xmm minuend = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm subtrahend = ctx.reg_alloc.UseScratchXmm(args[1]);
// This relies on the equality x-y == (x^y) - (((x^y)&y) << 1).
// Note that x^y always contains the LSB of the result.
@@ -488,17 +485,17 @@ void EmitX64::EmitPackedHalvingSubS16(EmitContext& ctx, IR::Inst* inst) {
code.psubw(minuend, subtrahend);
ctx.reg_alloc.DefineValue(code, inst, minuend);
ctx.reg_alloc.DefineValue(inst, minuend);
}
static void EmitPackedSubAdd(BlockOfCode& code, EmitContext& ctx, IR::Inst* inst, bool hi_is_sum, bool is_signed, bool is_halving) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const auto ge_inst = inst->GetAssociatedPseudoOperation(IR::Opcode::GetGEFromOp);
const Xbyak::Reg32 reg_a_hi = ctx.reg_alloc.UseScratchGpr(code, args[0]).cvt32();
const Xbyak::Reg32 reg_b_hi = ctx.reg_alloc.UseScratchGpr(code, args[1]).cvt32();
const Xbyak::Reg32 reg_a_lo = ctx.reg_alloc.ScratchGpr(code).cvt32();
const Xbyak::Reg32 reg_b_lo = ctx.reg_alloc.ScratchGpr(code).cvt32();
const Xbyak::Reg32 reg_a_hi = ctx.reg_alloc.UseScratchGpr(args[0]).cvt32();
const Xbyak::Reg32 reg_b_hi = ctx.reg_alloc.UseScratchGpr(args[1]).cvt32();
const Xbyak::Reg32 reg_a_lo = ctx.reg_alloc.ScratchGpr().cvt32();
const Xbyak::Reg32 reg_b_lo = ctx.reg_alloc.ScratchGpr().cvt32();
Xbyak::Reg32 reg_sum, reg_diff;
if (is_signed) {
@@ -546,7 +543,7 @@ static void EmitPackedSubAdd(BlockOfCode& code, EmitContext& ctx, IR::Inst* inst
code.and_(ge_diff, hi_is_sum ? 0x0000FFFF : 0xFFFF0000);
code.or_(ge_sum, ge_diff);
ctx.reg_alloc.DefineValue(code, ge_inst, ge_sum);
ctx.reg_alloc.DefineValue(ge_inst, ge_sum);
}
if (is_halving) {
@@ -560,7 +557,7 @@ static void EmitPackedSubAdd(BlockOfCode& code, EmitContext& ctx, IR::Inst* inst
// Merge them.
code.shld(reg_a_hi, reg_a_lo, 16);
ctx.reg_alloc.DefineValue(code, inst, reg_a_hi);
ctx.reg_alloc.DefineValue(inst, reg_a_hi);
}
void EmitX64::EmitPackedAddSubU16(EmitContext& ctx, IR::Inst* inst) {
@@ -598,12 +595,12 @@ void EmitX64::EmitPackedHalvingSubAddS16(EmitContext& ctx, IR::Inst* inst) {
static void EmitPackedOperation(BlockOfCode& code, EmitContext& ctx, IR::Inst* inst, void (Xbyak::CodeGenerator::*fn)(const Xbyak::Mmx& mmx, const Xbyak::Operand&)) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(args[1]);
(code.*fn)(xmm_a, xmm_b);
ctx.reg_alloc.DefineValue(code, inst, xmm_a);
ctx.reg_alloc.DefineValue(inst, xmm_a);
}
void EmitX64::EmitPackedSaturatedAddU8(EmitContext& ctx, IR::Inst* inst) {
@@ -641,9 +638,9 @@ void EmitX64::EmitPackedSaturatedSubS16(EmitContext& ctx, IR::Inst* inst) {
void EmitX64::EmitPackedAbsDiffSumU8(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseScratchXmm(code, args[1]);
const Xbyak::Xmm tmp = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseScratchXmm(args[1]);
const Xbyak::Xmm tmp = ctx.reg_alloc.ScratchXmm();
// TODO: Optimize with zero-extension detection
code.movaps(tmp, code.Const(xword, 0x0000'0000'ffff'ffff));
@@ -651,45 +648,45 @@ void EmitX64::EmitPackedAbsDiffSumU8(EmitContext& ctx, IR::Inst* inst) {
code.pand(xmm_b, tmp);
code.psadbw(xmm_a, xmm_b);
ctx.reg_alloc.DefineValue(code, inst, xmm_a);
ctx.reg_alloc.DefineValue(inst, xmm_a);
}
void EmitX64::EmitPackedSelect(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const size_t num_args_in_xmm = args[0].IsInXmm(ctx.reg_alloc) + args[1].IsInXmm(ctx.reg_alloc) + args[2].IsInXmm(ctx.reg_alloc);
const size_t num_args_in_xmm = args[0].IsInXmm() + args[1].IsInXmm() + args[2].IsInXmm();
if (num_args_in_xmm >= 2) {
const Xbyak::Xmm ge = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm to = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm from = ctx.reg_alloc.UseScratchXmm(code, args[2]);
const Xbyak::Xmm ge = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm to = ctx.reg_alloc.UseXmm(args[1]);
const Xbyak::Xmm from = ctx.reg_alloc.UseScratchXmm(args[2]);
code.pand(from, ge);
code.pandn(ge, to);
code.por(from, ge);
ctx.reg_alloc.DefineValue(code, inst, from);
ctx.reg_alloc.DefineValue(inst, from);
} else if (code.HasHostFeature(HostFeature::BMI1)) {
const Xbyak::Reg32 ge = ctx.reg_alloc.UseGpr(code, args[0]).cvt32();
const Xbyak::Reg32 to = ctx.reg_alloc.UseScratchGpr(code, args[1]).cvt32();
const Xbyak::Reg32 from = ctx.reg_alloc.UseScratchGpr(code, args[2]).cvt32();
const Xbyak::Reg32 ge = ctx.reg_alloc.UseGpr(args[0]).cvt32();
const Xbyak::Reg32 to = ctx.reg_alloc.UseScratchGpr(args[1]).cvt32();
const Xbyak::Reg32 from = ctx.reg_alloc.UseScratchGpr(args[2]).cvt32();
code.and_(from, ge);
code.andn(to, ge, to);
code.or_(from, to);
ctx.reg_alloc.DefineValue(code, inst, from);
ctx.reg_alloc.DefineValue(inst, from);
} else {
const Xbyak::Reg32 ge = ctx.reg_alloc.UseScratchGpr(code, args[0]).cvt32();
const Xbyak::Reg32 to = ctx.reg_alloc.UseGpr(code, args[1]).cvt32();
const Xbyak::Reg32 from = ctx.reg_alloc.UseScratchGpr(code, args[2]).cvt32();
const Xbyak::Reg32 ge = ctx.reg_alloc.UseScratchGpr(args[0]).cvt32();
const Xbyak::Reg32 to = ctx.reg_alloc.UseGpr(args[1]).cvt32();
const Xbyak::Reg32 from = ctx.reg_alloc.UseScratchGpr(args[2]).cvt32();
code.and_(from, ge);
code.not_(ge);
code.and_(ge, to);
code.or_(from, ge);
ctx.reg_alloc.DefineValue(code, inst, from);
ctx.reg_alloc.DefineValue(inst, from);
}
}


@@ -34,9 +34,9 @@ template<Op op, size_t size, bool has_overflow_inst = false>
void EmitSignedSaturatedOp(BlockOfCode& code, EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
Xbyak::Reg result = ctx.reg_alloc.UseScratchGpr(code, args[0]).changeBit(size);
Xbyak::Reg addend = ctx.reg_alloc.UseGpr(code, args[1]).changeBit(size);
Xbyak::Reg overflow = ctx.reg_alloc.ScratchGpr(code).changeBit(size);
Xbyak::Reg result = ctx.reg_alloc.UseScratchGpr(args[0]).changeBit(size);
Xbyak::Reg addend = ctx.reg_alloc.UseGpr(args[1]).changeBit(size);
Xbyak::Reg overflow = ctx.reg_alloc.ScratchGpr().changeBit(size);
constexpr u64 int_max = static_cast<u64>((std::numeric_limits<mcl::signed_integer_of_size<size>>::max)());
if constexpr (size < 64) {
@@ -66,21 +66,21 @@ void EmitSignedSaturatedOp(BlockOfCode& code, EmitContext& ctx, IR::Inst* inst)
code.seto(overflow.cvt8());
if constexpr (has_overflow_inst) {
if (const auto overflow_inst = inst->GetAssociatedPseudoOperation(IR::Opcode::GetOverflowFromOp)) {
ctx.reg_alloc.DefineValue(code, overflow_inst, overflow);
ctx.reg_alloc.DefineValue(overflow_inst, overflow);
}
} else {
code.or_(code.byte[code.ABI_JIT_PTR + code.GetJitStateInfo().offsetof_fpsr_qc], overflow.cvt8());
}
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
}
template<Op op, size_t size>
void EmitUnsignedSaturatedOp(BlockOfCode& code, EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
Xbyak::Reg op_result = ctx.reg_alloc.UseScratchGpr(code, args[0]).changeBit(size);
Xbyak::Reg addend = ctx.reg_alloc.UseScratchGpr(code, args[1]).changeBit(size);
Xbyak::Reg op_result = ctx.reg_alloc.UseScratchGpr(args[0]).changeBit(size);
Xbyak::Reg addend = ctx.reg_alloc.UseScratchGpr(args[1]).changeBit(size);
constexpr u64 boundary = op == Op::Add ? (std::numeric_limits<mcl::unsigned_integer_of_size<size>>::max)() : 0;
@@ -96,11 +96,11 @@ void EmitUnsignedSaturatedOp(BlockOfCode& code, EmitContext& ctx, IR::Inst* inst
code.cmovae(addend, op_result);
}
const Xbyak::Reg overflow = ctx.reg_alloc.ScratchGpr(code);
const Xbyak::Reg overflow = ctx.reg_alloc.ScratchGpr();
code.setb(overflow.cvt8());
code.or_(code.byte[code.ABI_JIT_PTR + code.GetJitStateInfo().offsetof_fpsr_qc], overflow.cvt8());
ctx.reg_alloc.DefineValue(code, inst, addend);
ctx.reg_alloc.DefineValue(inst, addend);
}
} // anonymous namespace
@@ -126,10 +126,10 @@ void EmitX64::EmitSignedSaturation(EmitContext& ctx, IR::Inst* inst) {
overflow_inst->ReplaceUsesWith(no_overflow);
}
// TODO: DefineValue directly on Argument
const Xbyak::Reg64 result = ctx.reg_alloc.ScratchGpr(code);
const Xbyak::Reg64 source = ctx.reg_alloc.UseGpr(code, args[0]);
const Xbyak::Reg64 result = ctx.reg_alloc.ScratchGpr();
const Xbyak::Reg64 source = ctx.reg_alloc.UseGpr(args[0]);
code.mov(result.cvt32(), source.cvt32());
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
return;
}
@@ -137,9 +137,9 @@ void EmitX64::EmitSignedSaturation(EmitContext& ctx, IR::Inst* inst) {
const u32 positive_saturated_value = (1u << (N - 1)) - 1;
const u32 negative_saturated_value = 1u << (N - 1);
const Xbyak::Reg32 result = ctx.reg_alloc.ScratchGpr(code).cvt32();
const Xbyak::Reg32 reg_a = ctx.reg_alloc.UseGpr(code, args[0]).cvt32();
const Xbyak::Reg32 overflow = ctx.reg_alloc.ScratchGpr(code).cvt32();
const Xbyak::Reg32 result = ctx.reg_alloc.ScratchGpr().cvt32();
const Xbyak::Reg32 reg_a = ctx.reg_alloc.UseGpr(args[0]).cvt32();
const Xbyak::Reg32 overflow = ctx.reg_alloc.ScratchGpr().cvt32();
// overflow now contains a value between 0 and mask if it was originally between {negative,positive}_saturated_value.
code.lea(overflow, code.ptr[reg_a.cvt64() + negative_saturated_value]);
@@ -156,10 +156,10 @@ void EmitX64::EmitSignedSaturation(EmitContext& ctx, IR::Inst* inst) {
if (overflow_inst) {
code.seta(overflow.cvt8());
ctx.reg_alloc.DefineValue(code, overflow_inst, overflow);
ctx.reg_alloc.DefineValue(overflow_inst, overflow);
}
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
}
void EmitX64::EmitUnsignedSaturation(EmitContext& ctx, IR::Inst* inst) {
@@ -171,9 +171,9 @@ void EmitX64::EmitUnsignedSaturation(EmitContext& ctx, IR::Inst* inst) {
const u32 saturated_value = (1u << N) - 1;
const Xbyak::Reg32 result = ctx.reg_alloc.ScratchGpr(code).cvt32();
const Xbyak::Reg32 reg_a = ctx.reg_alloc.UseGpr(code, args[0]).cvt32();
const Xbyak::Reg32 overflow = ctx.reg_alloc.ScratchGpr(code).cvt32();
const Xbyak::Reg32 result = ctx.reg_alloc.ScratchGpr().cvt32();
const Xbyak::Reg32 reg_a = ctx.reg_alloc.UseGpr(args[0]).cvt32();
const Xbyak::Reg32 overflow = ctx.reg_alloc.ScratchGpr().cvt32();
// Pseudocode: result = clamp(reg_a, 0, saturated_value);
code.xor_(overflow, overflow);
@@ -185,10 +185,10 @@ void EmitX64::EmitUnsignedSaturation(EmitContext& ctx, IR::Inst* inst) {
if (overflow_inst) {
code.seta(overflow.cvt8());
ctx.reg_alloc.DefineValue(code, overflow_inst, overflow);
ctx.reg_alloc.DefineValue(overflow_inst, overflow);
}
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
}
void EmitX64::EmitSignedSaturatedAdd8(EmitContext& ctx, IR::Inst* inst) {
@@ -210,9 +210,9 @@ void EmitX64::EmitSignedSaturatedAdd64(EmitContext& ctx, IR::Inst* inst) {
void EmitX64::EmitSignedSaturatedDoublingMultiplyReturnHigh16(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const Xbyak::Reg32 x = ctx.reg_alloc.UseScratchGpr(code, args[0]).cvt32();
const Xbyak::Reg32 y = ctx.reg_alloc.UseScratchGpr(code, args[1]).cvt32();
const Xbyak::Reg32 tmp = ctx.reg_alloc.ScratchGpr(code).cvt32();
const Xbyak::Reg32 x = ctx.reg_alloc.UseScratchGpr(args[0]).cvt32();
const Xbyak::Reg32 y = ctx.reg_alloc.UseScratchGpr(args[1]).cvt32();
const Xbyak::Reg32 tmp = ctx.reg_alloc.ScratchGpr().cvt32();
code.movsx(x, x.cvt16());
code.movsx(y, y.cvt16());
@@ -228,15 +228,15 @@ void EmitX64::EmitSignedSaturatedDoublingMultiplyReturnHigh16(EmitContext& ctx,
code.sets(tmp.cvt8());
code.or_(code.byte[code.ABI_JIT_PTR + code.GetJitStateInfo().offsetof_fpsr_qc], tmp.cvt8());
ctx.reg_alloc.DefineValue(code, inst, y);
ctx.reg_alloc.DefineValue(inst, y);
}
void EmitX64::EmitSignedSaturatedDoublingMultiplyReturnHigh32(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const Xbyak::Reg64 x = ctx.reg_alloc.UseScratchGpr(code, args[0]);
const Xbyak::Reg64 y = ctx.reg_alloc.UseScratchGpr(code, args[1]);
const Xbyak::Reg64 tmp = ctx.reg_alloc.ScratchGpr(code);
const Xbyak::Reg64 x = ctx.reg_alloc.UseScratchGpr(args[0]);
const Xbyak::Reg64 y = ctx.reg_alloc.UseScratchGpr(args[1]);
const Xbyak::Reg64 tmp = ctx.reg_alloc.ScratchGpr();
code.movsxd(x, x.cvt32());
code.movsxd(y, y.cvt32());
@@ -252,7 +252,7 @@ void EmitX64::EmitSignedSaturatedDoublingMultiplyReturnHigh32(EmitContext& ctx,
code.sets(tmp.cvt8());
code.or_(code.byte[code.ABI_JIT_PTR + code.GetJitStateInfo().offsetof_fpsr_qc], tmp.cvt8());
ctx.reg_alloc.DefineValue(code, inst, y);
ctx.reg_alloc.DefineValue(inst, y);
}
void EmitX64::EmitSignedSaturatedSub8(EmitContext& ctx, IR::Inst* inst) {


@@ -1,6 +1,3 @@
// SPDX-FileCopyrightText: Copyright 2025 Eden Emulator Project
// SPDX-License-Identifier: GPL-3.0-or-later
/* This file is part of the dynarmic project.
* Copyright (c) 2022 MerryMage
* SPDX-License-Identifier: 0BSD
@@ -25,9 +22,9 @@ void EmitX64::EmitSHA256Hash(EmitContext& ctx, IR::Inst* inst) {
// y = h g f e
// w = wk3 wk2 wk1 wk0
const Xbyak::Xmm x = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm y = ctx.reg_alloc.UseScratchXmm(code, args[1]);
const Xbyak::Xmm w = ctx.reg_alloc.UseXmm(code, args[2]);
const Xbyak::Xmm x = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm y = ctx.reg_alloc.UseScratchXmm(args[1]);
const Xbyak::Xmm w = ctx.reg_alloc.UseXmm(args[2]);
// x64 expects:
// 3 2 1 0
@@ -48,7 +45,7 @@ void EmitX64::EmitSHA256Hash(EmitContext& ctx, IR::Inst* inst) {
code.shufps(y, x, part1 ? 0b10111011 : 0b00010001);
ctx.reg_alloc.DefineValue(code, inst, y);
ctx.reg_alloc.DefineValue(inst, y);
}
void EmitX64::EmitSHA256MessageSchedule0(EmitContext& ctx, IR::Inst* inst) {
@@ -56,12 +53,12 @@ void EmitX64::EmitSHA256MessageSchedule0(EmitContext& ctx, IR::Inst* inst) {
ASSERT(code.HasHostFeature(HostFeature::SHA));
const Xbyak::Xmm x = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm y = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm x = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm y = ctx.reg_alloc.UseXmm(args[1]);
code.sha256msg1(x, y);
ctx.reg_alloc.DefineValue(code, inst, x);
ctx.reg_alloc.DefineValue(inst, x);
}
void EmitX64::EmitSHA256MessageSchedule1(EmitContext& ctx, IR::Inst* inst) {
@@ -69,16 +66,16 @@ void EmitX64::EmitSHA256MessageSchedule1(EmitContext& ctx, IR::Inst* inst) {
ASSERT(code.HasHostFeature(HostFeature::SHA));
const Xbyak::Xmm x = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm y = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm z = ctx.reg_alloc.UseXmm(code, args[2]);
const Xbyak::Xmm x = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm y = ctx.reg_alloc.UseXmm(args[1]);
const Xbyak::Xmm z = ctx.reg_alloc.UseXmm(args[2]);
code.movaps(xmm0, z);
code.palignr(xmm0, y, 4);
code.paddd(x, xmm0);
code.sha256msg2(x, z);
ctx.reg_alloc.DefineValue(code, inst, x);
ctx.reg_alloc.DefineValue(inst, x);
}
} // namespace Dynarmic::Backend::X64


@@ -1,6 +1,3 @@
// SPDX-FileCopyrightText: Copyright 2025 Eden Emulator Project
// SPDX-License-Identifier: GPL-3.0-or-later
/* This file is part of the dynarmic project.
* Copyright (c) 2018 MerryMage
* SPDX-License-Identifier: 0BSD
@@ -16,7 +13,7 @@ namespace Dynarmic::Backend::X64 {
void EmitX64::EmitSM4AccessSubstitutionBox(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
ctx.reg_alloc.HostCall(code, inst, args[0]);
ctx.reg_alloc.HostCall(inst, args[0]);
code.CallFunction(&Common::Crypto::SM4::AccessSubstitutionBox);
code.movzx(code.ABI_RETURN.cvt32(), code.ABI_RETURN.cvt8());
}

File diff suppressed because it is too large


@@ -96,7 +96,7 @@ void HandleNaNs(BlockOfCode& code, EmitContext& ctx, bool fpcr_controlled, std::
if (code.HasHostFeature(HostFeature::SSE41)) {
code.ptest(nan_mask, nan_mask);
} else {
const Xbyak::Reg32 bitmask = ctx.reg_alloc.ScratchGpr(code).cvt32();
const Xbyak::Reg32 bitmask = ctx.reg_alloc.ScratchGpr().cvt32();
code.movmskps(bitmask, nan_mask);
code.cmp(bitmask, 0);
}
@@ -312,13 +312,13 @@ void EmitTwoOpVectorOperation(BlockOfCode& code, EmitContext& ctx, IR::Inst* ins
Xbyak::Xmm result;
if constexpr (std::is_member_function_pointer_v<Function>) {
result = ctx.reg_alloc.UseScratchXmm(code, args[0]);
result = ctx.reg_alloc.UseScratchXmm(args[0]);
MaybeStandardFPSCRValue(code, ctx, fpcr_controlled, [&] {
(code.*fn)(result);
});
} else {
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseXmm(code, args[0]);
result = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseXmm(args[0]);
result = ctx.reg_alloc.ScratchXmm();
MaybeStandardFPSCRValue(code, ctx, fpcr_controlled, [&] {
fn(result, xmm_a);
});
@@ -328,13 +328,13 @@ void EmitTwoOpVectorOperation(BlockOfCode& code, EmitContext& ctx, IR::Inst* ins
ForceToDefaultNaN<fsize>(code, ctx.FPCR(fpcr_controlled), result);
}
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
return;
}
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseXmm(code, args[0]);
const Xbyak::Xmm nan_mask = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm();
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseXmm(args[0]);
const Xbyak::Xmm nan_mask = ctx.reg_alloc.ScratchXmm();
if constexpr (std::is_member_function_pointer_v<Function>) {
code.movaps(result, xmm_a);
@@ -352,7 +352,7 @@ void EmitTwoOpVectorOperation(BlockOfCode& code, EmitContext& ctx, IR::Inst* ins
HandleNaNs<fsize, 1>(code, ctx, fpcr_controlled, {result, xmm_a}, nan_mask, nan_handler);
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
}
enum class CheckInputNaN {
@@ -368,8 +368,8 @@ void EmitThreeOpVectorOperation(BlockOfCode& code, EmitContext& ctx, IR::Inst* i
const bool fpcr_controlled = args[2].GetImmediateU1();
if (ctx.FPCR(fpcr_controlled).DN() || ctx.HasOptimization(OptimizationFlag::Unsafe_InaccurateNaN)) {
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(args[1]);
if constexpr (std::is_member_function_pointer_v<Function>) {
MaybeStandardFPSCRValue(code, ctx, fpcr_controlled, [&] {
@@ -385,14 +385,14 @@ void EmitThreeOpVectorOperation(BlockOfCode& code, EmitContext& ctx, IR::Inst* i
ForceToDefaultNaN<fsize>(code, ctx.FPCR(fpcr_controlled), xmm_a);
}
ctx.reg_alloc.DefineValue(code, inst, xmm_a);
ctx.reg_alloc.DefineValue(inst, xmm_a);
return;
}
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseXmm(code, args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm nan_mask = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm();
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseXmm(args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(args[1]);
const Xbyak::Xmm nan_mask = ctx.reg_alloc.ScratchXmm();
code.movaps(result, xmm_a);
@@ -422,7 +422,7 @@ void EmitThreeOpVectorOperation(BlockOfCode& code, EmitContext& ctx, IR::Inst* i
HandleNaNs<fsize, 2>(code, ctx, fpcr_controlled, {result, xmm_a, xmm_b}, nan_mask, nan_handler);
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
}
template<typename F>
@@ -448,16 +448,16 @@ void EmitTwoOpFallbackWithoutRegAlloc(BlockOfCode& code, EmitContext& ctx, Xbyak
template<size_t fpcr_controlled_arg_index = 1, typename F>
void EmitTwoOpFallback(BlockOfCode& code, EmitContext& ctx, IR::Inst* inst, F lambda) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const Xbyak::Xmm arg1 = ctx.reg_alloc.UseXmm(code, args[0]);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm arg1 = ctx.reg_alloc.UseXmm(args[0]);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm();
ctx.reg_alloc.EndOfAllocScope();
ctx.reg_alloc.HostCall(code, nullptr);
ctx.reg_alloc.HostCall(nullptr);
const bool fpcr_controlled = args[fpcr_controlled_arg_index].GetImmediateU1();
EmitTwoOpFallbackWithoutRegAlloc(code, ctx, result, arg1, lambda, fpcr_controlled);
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
}
template<typename Lambda>
@@ -501,17 +501,17 @@ void EmitThreeOpFallbackWithoutRegAlloc(BlockOfCode& code, EmitContext& ctx, Xby
template<typename Lambda>
void EmitThreeOpFallback(BlockOfCode& code, EmitContext& ctx, IR::Inst* inst, Lambda lambda) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const Xbyak::Xmm arg1 = ctx.reg_alloc.UseXmm(code, args[0]);
const Xbyak::Xmm arg2 = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm arg1 = ctx.reg_alloc.UseXmm(args[0]);
const Xbyak::Xmm arg2 = ctx.reg_alloc.UseXmm(args[1]);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm();
ctx.reg_alloc.EndOfAllocScope();
ctx.reg_alloc.HostCall(code, nullptr);
ctx.reg_alloc.HostCall(nullptr);
const bool fpcr_controlled = args[2].GetImmediateU1();
EmitThreeOpFallbackWithoutRegAlloc(code, ctx, result, arg1, arg2, lambda, fpcr_controlled);
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
}
enum class LoadPreviousResult {
@@ -565,16 +565,16 @@ template<typename Lambda>
void EmitFourOpFallback(BlockOfCode& code, EmitContext& ctx, IR::Inst* inst, Lambda lambda) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const bool fpcr_controlled = args[3].GetImmediateU1();
const Xbyak::Xmm arg1 = ctx.reg_alloc.UseXmm(code, args[0]);
const Xbyak::Xmm arg2 = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm arg3 = ctx.reg_alloc.UseXmm(code, args[2]);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm arg1 = ctx.reg_alloc.UseXmm(args[0]);
const Xbyak::Xmm arg2 = ctx.reg_alloc.UseXmm(args[1]);
const Xbyak::Xmm arg3 = ctx.reg_alloc.UseXmm(args[2]);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm();
ctx.reg_alloc.EndOfAllocScope();
ctx.reg_alloc.HostCall(code, nullptr);
ctx.reg_alloc.HostCall(nullptr);
EmitFourOpFallbackWithoutRegAlloc(code, ctx, result, arg1, arg2, arg3, lambda, fpcr_controlled);
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
}
} // anonymous namespace
@@ -582,9 +582,9 @@ void EmitFourOpFallback(BlockOfCode& code, EmitContext& ctx, IR::Inst* inst, Lam
template<size_t fsize>
void FPVectorAbs(BlockOfCode& code, EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const Xbyak::Xmm a = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm a = ctx.reg_alloc.UseScratchXmm(args[0]);
code.andps(a, GetNonSignMaskVector<fsize>(code));
ctx.reg_alloc.DefineValue(code, inst, a);
ctx.reg_alloc.DefineValue(inst, a);
}
void EmitX64::EmitFPVectorAbs16(EmitContext& ctx, IR::Inst* inst) {
@@ -626,29 +626,29 @@ void EmitX64::EmitFPVectorEqual16(EmitContext& ctx, IR::Inst* inst) {
void EmitX64::EmitFPVectorEqual32(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const bool fpcr_controlled = args[2].GetImmediateU1();
const Xbyak::Xmm a = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm b = ctx.FPCR(fpcr_controlled).FZ() ? ctx.reg_alloc.UseScratchXmm(code, args[1]) : ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm a = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm b = ctx.FPCR(fpcr_controlled).FZ() ? ctx.reg_alloc.UseScratchXmm(args[1]) : ctx.reg_alloc.UseXmm(args[1]);
MaybeStandardFPSCRValue(code, ctx, fpcr_controlled, [&] {
DenormalsAreZero<32>(code, ctx.FPCR(fpcr_controlled), {a, b}, xmm0);
code.cmpeqps(a, b);
});
ctx.reg_alloc.DefineValue(code, inst, a);
ctx.reg_alloc.DefineValue(inst, a);
}
void EmitX64::EmitFPVectorEqual64(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const bool fpcr_controlled = args[2].GetImmediateU1();
const Xbyak::Xmm a = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm b = ctx.FPCR(fpcr_controlled).FZ() ? ctx.reg_alloc.UseScratchXmm(code, args[1]) : ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm a = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm b = ctx.FPCR(fpcr_controlled).FZ() ? ctx.reg_alloc.UseScratchXmm(args[1]) : ctx.reg_alloc.UseXmm(args[1]);
MaybeStandardFPSCRValue(code, ctx, fpcr_controlled, [&] {
DenormalsAreZero<64>(code, ctx.FPCR(fpcr_controlled), {a, b}, xmm0);
code.cmpeqpd(a, b);
});
ctx.reg_alloc.DefineValue(code, inst, a);
ctx.reg_alloc.DefineValue(inst, a);
}
template<FP::RoundingMode rounding_mode>
@@ -664,13 +664,13 @@ void EmitX64::EmitFPVectorFromHalf32(EmitContext& ctx, IR::Inst* inst) {
if (code.HasHostFeature(HostFeature::F16C) && !ctx.FPCR().AHP() && !ctx.FPCR().FZ16()) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm value = ctx.reg_alloc.UseXmm(code, args[0]);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm();
const Xbyak::Xmm value = ctx.reg_alloc.UseXmm(args[0]);
code.vcvtph2ps(result, value);
ForceToDefaultNaN<32>(code, ctx.FPCR(fpcr_controlled), result);
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
} else {
switch (rounding_mode) {
case FP::RoundingMode::ToNearest_TieEven:
@@ -696,7 +696,7 @@ void EmitX64::EmitFPVectorFromHalf32(EmitContext& ctx, IR::Inst* inst) {
void EmitX64::EmitFPVectorFromSignedFixed32(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const Xbyak::Xmm xmm = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm xmm = ctx.reg_alloc.UseScratchXmm(args[0]);
const int fbits = args[1].GetImmediateU8();
const FP::RoundingMode rounding_mode = static_cast<FP::RoundingMode>(args[2].GetImmediateU8());
const bool fpcr_controlled = args[3].GetImmediateU1();
@@ -709,12 +709,12 @@ void EmitX64::EmitFPVectorFromSignedFixed32(EmitContext& ctx, IR::Inst* inst) {
}
});
ctx.reg_alloc.DefineValue(code, inst, xmm);
ctx.reg_alloc.DefineValue(inst, xmm);
}
void EmitX64::EmitFPVectorFromSignedFixed64(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const Xbyak::Xmm xmm = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm xmm = ctx.reg_alloc.UseScratchXmm(args[0]);
const int fbits = args[1].GetImmediateU8();
const FP::RoundingMode rounding_mode = static_cast<FP::RoundingMode>(args[2].GetImmediateU8());
const bool fpcr_controlled = args[3].GetImmediateU1();
@@ -724,8 +724,8 @@ void EmitX64::EmitFPVectorFromSignedFixed64(EmitContext& ctx, IR::Inst* inst) {
if (code.HasHostFeature(HostFeature::AVX512_OrthoFloat)) {
code.vcvtqq2pd(xmm, xmm);
} else if (code.HasHostFeature(HostFeature::SSE41)) {
const Xbyak::Xmm xmm_tmp = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Reg64 tmp = ctx.reg_alloc.ScratchGpr(code);
const Xbyak::Xmm xmm_tmp = ctx.reg_alloc.ScratchXmm();
const Xbyak::Reg64 tmp = ctx.reg_alloc.ScratchGpr();
// First quadword
code.movq(tmp, xmm);
@@ -738,9 +738,9 @@ void EmitX64::EmitFPVectorFromSignedFixed64(EmitContext& ctx, IR::Inst* inst) {
// Combine
code.unpcklpd(xmm, xmm_tmp);
} else {
const Xbyak::Xmm high_xmm = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm xmm_tmp = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Reg64 tmp = ctx.reg_alloc.ScratchGpr(code);
const Xbyak::Xmm high_xmm = ctx.reg_alloc.ScratchXmm();
const Xbyak::Xmm xmm_tmp = ctx.reg_alloc.ScratchXmm();
const Xbyak::Reg64 tmp = ctx.reg_alloc.ScratchGpr();
// First quadword
code.movhlps(high_xmm, xmm);
@@ -760,12 +760,12 @@ void EmitX64::EmitFPVectorFromSignedFixed64(EmitContext& ctx, IR::Inst* inst) {
}
});
ctx.reg_alloc.DefineValue(code, inst, xmm);
ctx.reg_alloc.DefineValue(inst, xmm);
}
void EmitX64::EmitFPVectorFromUnsignedFixed32(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const Xbyak::Xmm xmm = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm xmm = ctx.reg_alloc.UseScratchXmm(args[0]);
const int fbits = args[1].GetImmediateU8();
const FP::RoundingMode rounding_mode = static_cast<FP::RoundingMode>(args[2].GetImmediateU8());
const bool fpcr_controlled = args[3].GetImmediateU1();
@@ -779,7 +779,7 @@ void EmitX64::EmitFPVectorFromUnsignedFixed32(EmitContext& ctx, IR::Inst* inst)
const Xbyak::Address mem_53000000 = code.BConst<32>(xword, 0x53000000);
const Xbyak::Address mem_D3000080 = code.BConst<32>(xword, 0xD3000080);
const Xbyak::Xmm tmp = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm tmp = ctx.reg_alloc.ScratchXmm();
if (code.HasHostFeature(HostFeature::AVX)) {
code.vpblendw(tmp, xmm, mem_4B000000, 0b10101010);
@@ -810,12 +810,12 @@ void EmitX64::EmitFPVectorFromUnsignedFixed32(EmitContext& ctx, IR::Inst* inst)
}
});
ctx.reg_alloc.DefineValue(code, inst, xmm);
ctx.reg_alloc.DefineValue(inst, xmm);
}
void EmitX64::EmitFPVectorFromUnsignedFixed64(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const Xbyak::Xmm xmm = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm xmm = ctx.reg_alloc.UseScratchXmm(args[0]);
const int fbits = args[1].GetImmediateU8();
const FP::RoundingMode rounding_mode = static_cast<FP::RoundingMode>(args[2].GetImmediateU8());
const bool fpcr_controlled = args[3].GetImmediateU1();
@@ -828,9 +828,9 @@ void EmitX64::EmitFPVectorFromUnsignedFixed64(EmitContext& ctx, IR::Inst* inst)
const Xbyak::Address unpack = code.Const(xword, 0x4530000043300000, 0);
const Xbyak::Address subtrahend = code.Const(xword, 0x4330000000000000, 0x4530000000000000);
const Xbyak::Xmm unpack_reg = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm subtrahend_reg = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm tmp1 = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm unpack_reg = ctx.reg_alloc.ScratchXmm();
const Xbyak::Xmm subtrahend_reg = ctx.reg_alloc.ScratchXmm();
const Xbyak::Xmm tmp1 = ctx.reg_alloc.ScratchXmm();
if (code.HasHostFeature(HostFeature::AVX)) {
code.vmovapd(unpack_reg, unpack);
@@ -846,7 +846,7 @@ void EmitX64::EmitFPVectorFromUnsignedFixed64(EmitContext& ctx, IR::Inst* inst)
code.vhaddpd(xmm, tmp1, xmm);
} else {
const Xbyak::Xmm tmp2 = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm tmp2 = ctx.reg_alloc.ScratchXmm();
code.movapd(unpack_reg, unpack);
code.movapd(subtrahend_reg, subtrahend);
@@ -877,63 +877,63 @@ void EmitX64::EmitFPVectorFromUnsignedFixed64(EmitContext& ctx, IR::Inst* inst)
}
});
ctx.reg_alloc.DefineValue(code, inst, xmm);
ctx.reg_alloc.DefineValue(inst, xmm);
}
void EmitX64::EmitFPVectorGreater32(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const bool fpcr_controlled = args[2].GetImmediateU1();
const Xbyak::Xmm a = ctx.FPCR(fpcr_controlled).FZ() ? ctx.reg_alloc.UseScratchXmm(code, args[0]) : ctx.reg_alloc.UseXmm(code, args[0]);
const Xbyak::Xmm b = ctx.reg_alloc.UseScratchXmm(code, args[1]);
const Xbyak::Xmm a = ctx.FPCR(fpcr_controlled).FZ() ? ctx.reg_alloc.UseScratchXmm(args[0]) : ctx.reg_alloc.UseXmm(args[0]);
const Xbyak::Xmm b = ctx.reg_alloc.UseScratchXmm(args[1]);
MaybeStandardFPSCRValue(code, ctx, fpcr_controlled, [&] {
DenormalsAreZero<32>(code, ctx.FPCR(fpcr_controlled), {a, b}, xmm0);
code.cmpltps(b, a);
});
ctx.reg_alloc.DefineValue(code, inst, b);
ctx.reg_alloc.DefineValue(inst, b);
}
void EmitX64::EmitFPVectorGreater64(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const bool fpcr_controlled = args[2].GetImmediateU1();
const Xbyak::Xmm a = ctx.FPCR(fpcr_controlled).FZ() ? ctx.reg_alloc.UseScratchXmm(code, args[0]) : ctx.reg_alloc.UseXmm(code, args[0]);
const Xbyak::Xmm b = ctx.reg_alloc.UseScratchXmm(code, args[1]);
const Xbyak::Xmm a = ctx.FPCR(fpcr_controlled).FZ() ? ctx.reg_alloc.UseScratchXmm(args[0]) : ctx.reg_alloc.UseXmm(args[0]);
const Xbyak::Xmm b = ctx.reg_alloc.UseScratchXmm(args[1]);
MaybeStandardFPSCRValue(code, ctx, fpcr_controlled, [&] {
DenormalsAreZero<64>(code, ctx.FPCR(fpcr_controlled), {a, b}, xmm0);
code.cmpltpd(b, a);
});
ctx.reg_alloc.DefineValue(code, inst, b);
ctx.reg_alloc.DefineValue(inst, b);
}
void EmitX64::EmitFPVectorGreaterEqual32(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const bool fpcr_controlled = args[2].GetImmediateU1();
const Xbyak::Xmm a = ctx.FPCR(fpcr_controlled).FZ() ? ctx.reg_alloc.UseScratchXmm(code, args[0]) : ctx.reg_alloc.UseXmm(code, args[0]);
const Xbyak::Xmm b = ctx.reg_alloc.UseScratchXmm(code, args[1]);
const Xbyak::Xmm a = ctx.FPCR(fpcr_controlled).FZ() ? ctx.reg_alloc.UseScratchXmm(args[0]) : ctx.reg_alloc.UseXmm(args[0]);
const Xbyak::Xmm b = ctx.reg_alloc.UseScratchXmm(args[1]);
MaybeStandardFPSCRValue(code, ctx, fpcr_controlled, [&] {
DenormalsAreZero<32>(code, ctx.FPCR(fpcr_controlled), {a, b}, xmm0);
code.cmpleps(b, a);
});
ctx.reg_alloc.DefineValue(code, inst, b);
ctx.reg_alloc.DefineValue(inst, b);
}
void EmitX64::EmitFPVectorGreaterEqual64(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const bool fpcr_controlled = args[2].GetImmediateU1();
const Xbyak::Xmm a = ctx.FPCR(fpcr_controlled).FZ() ? ctx.reg_alloc.UseScratchXmm(code, args[0]) : ctx.reg_alloc.UseXmm(code, args[0]);
const Xbyak::Xmm b = ctx.reg_alloc.UseScratchXmm(code, args[1]);
const Xbyak::Xmm a = ctx.FPCR(fpcr_controlled).FZ() ? ctx.reg_alloc.UseScratchXmm(args[0]) : ctx.reg_alloc.UseXmm(args[0]);
const Xbyak::Xmm b = ctx.reg_alloc.UseScratchXmm(args[1]);
MaybeStandardFPSCRValue(code, ctx, fpcr_controlled, [&] {
DenormalsAreZero<64>(code, ctx.FPCR(fpcr_controlled), {a, b}, xmm0);
code.cmplepd(b, a);
});
ctx.reg_alloc.DefineValue(code, inst, b);
ctx.reg_alloc.DefineValue(inst, b);
}
template<size_t fsize, bool is_max>
@@ -942,12 +942,12 @@ static void EmitFPVectorMinMax(BlockOfCode& code, EmitContext& ctx, IR::Inst* in
if (ctx.FPCR(fpcr_controlled).DN()) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const Xbyak::Xmm result = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm xmm_b = ctx.FPCR(fpcr_controlled).FZ() ? ctx.reg_alloc.UseScratchXmm(code, args[1]) : ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm result = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm xmm_b = ctx.FPCR(fpcr_controlled).FZ() ? ctx.reg_alloc.UseScratchXmm(args[1]) : ctx.reg_alloc.UseXmm(args[1]);
const Xbyak::Xmm mask = xmm0;
const Xbyak::Xmm eq = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm nan_mask = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm eq = ctx.reg_alloc.ScratchXmm();
const Xbyak::Xmm nan_mask = ctx.reg_alloc.ScratchXmm();
MaybeStandardFPSCRValue(code, ctx, fpcr_controlled, [&] {
DenormalsAreZero<fsize>(code, ctx.FPCR(fpcr_controlled), {result, xmm_b}, mask);
@@ -994,7 +994,7 @@ static void EmitFPVectorMinMax(BlockOfCode& code, EmitContext& ctx, IR::Inst* in
}
});
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
return;
}
@@ -1002,11 +1002,11 @@ static void EmitFPVectorMinMax(BlockOfCode& code, EmitContext& ctx, IR::Inst* in
EmitThreeOpVectorOperation<fsize, DefaultIndexer>(
code, ctx, inst, [&](const Xbyak::Xmm& result, Xbyak::Xmm xmm_b) {
const Xbyak::Xmm mask = xmm0;
const Xbyak::Xmm eq = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm eq = ctx.reg_alloc.ScratchXmm();
if (ctx.FPCR(fpcr_controlled).FZ()) {
const Xbyak::Xmm prev_xmm_b = xmm_b;
xmm_b = ctx.reg_alloc.ScratchXmm(code);
xmm_b = ctx.reg_alloc.ScratchXmm();
code.movaps(xmm_b, prev_xmm_b);
DenormalsAreZero<fsize>(code, ctx.FPCR(fpcr_controlled), {result, xmm_b}, mask);
}
@@ -1053,13 +1053,13 @@ static void EmitFPVectorMinMaxNumeric(BlockOfCode& code, EmitContext& ctx, IR::I
const bool fpcr_controlled = inst->GetArg(2).GetU1();
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseScratchXmm(code, args[1]);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm intermediate_result = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseScratchXmm(args[1]);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm();
const Xbyak::Xmm intermediate_result = ctx.reg_alloc.ScratchXmm();
const Xbyak::Xmm tmp1 = xmm0;
const Xbyak::Xmm tmp2 = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm tmp2 = ctx.reg_alloc.ScratchXmm();
// NaN requirements:
// op1 op2 result
@@ -1139,7 +1139,7 @@ static void EmitFPVectorMinMaxNumeric(BlockOfCode& code, EmitContext& ctx, IR::I
}
});
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
return;
}
@@ -1230,7 +1230,7 @@ static void EmitFPVectorMinMaxNumeric(BlockOfCode& code, EmitContext& ctx, IR::I
}
});
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
}
void EmitX64::EmitFPVectorMax32(EmitContext& ctx, IR::Inst* inst) {
@@ -1316,27 +1316,27 @@ void EmitFPVectorMulAdd(BlockOfCode& code, EmitContext& ctx, IR::Inst* inst) {
if (code.HasHostFeature(HostFeature::FMA) && !needs_rounding_correction && !needs_nan_correction) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const Xbyak::Xmm result = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm xmm_c = ctx.reg_alloc.UseXmm(code, args[2]);
const Xbyak::Xmm result = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(args[1]);
const Xbyak::Xmm xmm_c = ctx.reg_alloc.UseXmm(args[2]);
MaybeStandardFPSCRValue(code, ctx, fpcr_controlled, [&] {
FCODE(vfmadd231p)(result, xmm_b, xmm_c);
ForceToDefaultNaN<fsize>(code, ctx.FPCR(fpcr_controlled), result);
});
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
return;
}
if (code.HasHostFeature(HostFeature::FMA | HostFeature::AVX)) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseXmm(code, args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm xmm_c = ctx.reg_alloc.UseXmm(code, args[2]);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm tmp = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseXmm(args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(args[1]);
const Xbyak::Xmm xmm_c = ctx.reg_alloc.UseXmm(args[2]);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm();
const Xbyak::Xmm tmp = ctx.reg_alloc.ScratchXmm();
SharedLabel end = GenSharedLabel(), fallback = GenSharedLabel();
@@ -1375,21 +1375,21 @@ void EmitFPVectorMulAdd(BlockOfCode& code, EmitContext& ctx, IR::Inst* inst) {
code.jmp(*end, code.T_NEAR);
});
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
return;
}
if (ctx.HasOptimization(OptimizationFlag::Unsafe_UnfuseFMA)) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const Xbyak::Xmm operand1 = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm operand2 = ctx.reg_alloc.UseScratchXmm(code, args[1]);
const Xbyak::Xmm operand3 = ctx.reg_alloc.UseXmm(code, args[2]);
const Xbyak::Xmm operand1 = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm operand2 = ctx.reg_alloc.UseScratchXmm(args[1]);
const Xbyak::Xmm operand3 = ctx.reg_alloc.UseXmm(args[2]);
FCODE(mulp)(operand2, operand3);
FCODE(addp)(operand1, operand2);
ctx.reg_alloc.DefineValue(code, inst, operand1);
ctx.reg_alloc.DefineValue(inst, operand1);
return;
}
}
@@ -1417,10 +1417,10 @@ static void EmitFPVectorMulX(BlockOfCode& code, EmitContext& ctx, IR::Inst* inst
const bool fpcr_controlled = args[2].GetImmediateU1();
if (ctx.FPCR(fpcr_controlled).DN() && code.HasHostFeature(HostFeature::AVX)) {
const Xbyak::Xmm result = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm operand = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm tmp = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm twos = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm result = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm operand = ctx.reg_alloc.UseXmm(args[1]);
const Xbyak::Xmm tmp = ctx.reg_alloc.ScratchXmm();
const Xbyak::Xmm twos = ctx.reg_alloc.ScratchXmm();
MaybeStandardFPSCRValue(code, ctx, fpcr_controlled, [&] {
FCODE(vcmpunordp)(xmm0, result, operand);
@@ -1434,14 +1434,14 @@ static void EmitFPVectorMulX(BlockOfCode& code, EmitContext& ctx, IR::Inst* inst
FCODE(blendvp)(result, twos);
});
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
return;
}
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseXmm(code, args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm nan_mask = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm();
const Xbyak::Xmm xmm_a = ctx.reg_alloc.UseXmm(args[0]);
const Xbyak::Xmm xmm_b = ctx.reg_alloc.UseXmm(args[1]);
const Xbyak::Xmm nan_mask = ctx.reg_alloc.ScratchXmm();
code.movaps(nan_mask, xmm_b);
code.movaps(result, xmm_a);
@@ -1464,7 +1464,7 @@ static void EmitFPVectorMulX(BlockOfCode& code, EmitContext& ctx, IR::Inst* inst
HandleNaNs<fsize, 2>(code, ctx, fpcr_controlled, {result, xmm_a, xmm_b}, nan_mask, nan_handler);
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
}
void EmitX64::EmitFPVectorMulX32(EmitContext& ctx, IR::Inst* inst) {
@@ -1482,12 +1482,12 @@ void FPVectorNeg(BlockOfCode& code, EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const Xbyak::Xmm a = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm a = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Address mask = code.BConst<fsize>(xword, sign_mask);
code.xorps(a, mask);
ctx.reg_alloc.DefineValue(code, inst, a);
ctx.reg_alloc.DefineValue(inst, a);
}
void EmitX64::EmitFPVectorNeg16(EmitContext& ctx, IR::Inst* inst) {
@@ -1512,7 +1512,7 @@ void EmitX64::EmitFPVectorPairedAdd64(EmitContext& ctx, IR::Inst* inst) {
void EmitX64::EmitFPVectorPairedAddLower32(EmitContext& ctx, IR::Inst* inst) {
EmitThreeOpVectorOperation<32, PairedLowerIndexer>(code, ctx, inst, [&](Xbyak::Xmm result, Xbyak::Xmm xmm_b) {
const Xbyak::Xmm zero = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm zero = ctx.reg_alloc.ScratchXmm();
code.xorps(zero, zero);
code.punpcklqdq(result, xmm_b);
code.haddps(result, zero);
@@ -1521,7 +1521,7 @@ void EmitX64::EmitFPVectorPairedAddLower32(EmitContext& ctx, IR::Inst* inst) {
void EmitX64::EmitFPVectorPairedAddLower64(EmitContext& ctx, IR::Inst* inst) {
EmitThreeOpVectorOperation<64, PairedLowerIndexer>(code, ctx, inst, [&](Xbyak::Xmm result, Xbyak::Xmm xmm_b) {
const Xbyak::Xmm zero = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm zero = ctx.reg_alloc.ScratchXmm();
code.xorps(zero, zero);
code.punpcklqdq(result, xmm_b);
code.haddpd(result, zero);
@@ -1535,8 +1535,8 @@ static void EmitRecipEstimate(BlockOfCode& code, EmitContext& ctx, IR::Inst* ins
if constexpr (fsize != 16) {
if (ctx.HasOptimization(OptimizationFlag::Unsafe_ReducedErrorFP)) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const Xbyak::Xmm operand = ctx.reg_alloc.UseXmm(code, args[0]);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm operand = ctx.reg_alloc.UseXmm(args[0]);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm();
if (code.HasHostFeature(HostFeature::AVX512_OrthoFloat)) {
FCODE(vrcp14p)(result, operand);
@@ -1550,7 +1550,7 @@ static void EmitRecipEstimate(BlockOfCode& code, EmitContext& ctx, IR::Inst* ins
}
}
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
return;
}
}
@@ -1589,16 +1589,16 @@ static void EmitRecipStepFused(BlockOfCode& code, EmitContext& ctx, IR::Inst* in
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const bool fpcr_controlled = args[2].GetImmediateU1();
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm operand1 = ctx.reg_alloc.UseXmm(code, args[0]);
const Xbyak::Xmm operand2 = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm();
const Xbyak::Xmm operand1 = ctx.reg_alloc.UseXmm(args[0]);
const Xbyak::Xmm operand2 = ctx.reg_alloc.UseXmm(args[1]);
MaybeStandardFPSCRValue(code, ctx, fpcr_controlled, [&] {
code.movaps(result, GetVectorOf<fsize, false, 0, 2>(code));
FCODE(vfnmadd231p)(result, operand1, operand2);
});
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
return;
}
@@ -1606,10 +1606,10 @@ static void EmitRecipStepFused(BlockOfCode& code, EmitContext& ctx, IR::Inst* in
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const bool fpcr_controlled = args[2].GetImmediateU1();
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm operand1 = ctx.reg_alloc.UseXmm(code, args[0]);
const Xbyak::Xmm operand2 = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm tmp = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm();
const Xbyak::Xmm operand1 = ctx.reg_alloc.UseXmm(args[0]);
const Xbyak::Xmm operand2 = ctx.reg_alloc.UseXmm(args[1]);
const Xbyak::Xmm tmp = ctx.reg_alloc.ScratchXmm();
SharedLabel end = GenSharedLabel(), fallback = GenSharedLabel();
@@ -1633,22 +1633,22 @@ static void EmitRecipStepFused(BlockOfCode& code, EmitContext& ctx, IR::Inst* in
code.jmp(*end, code.T_NEAR);
});
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
return;
}
if (ctx.HasOptimization(OptimizationFlag::Unsafe_UnfuseFMA)) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const Xbyak::Xmm operand1 = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm operand2 = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm operand1 = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm operand2 = ctx.reg_alloc.UseXmm(args[1]);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm();
code.movaps(result, GetVectorOf<fsize, false, 0, 2>(code));
FCODE(mulp)(operand1, operand2);
FCODE(subp)(result, operand1);
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
return;
}
}
@@ -1757,8 +1757,8 @@ static void EmitRSqrtEstimate(BlockOfCode& code, EmitContext& ctx, IR::Inst* ins
if constexpr (fsize != 16) {
if (ctx.HasOptimization(OptimizationFlag::Unsafe_ReducedErrorFP)) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const Xbyak::Xmm operand = ctx.reg_alloc.UseXmm(code, args[0]);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm operand = ctx.reg_alloc.UseXmm(args[0]);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm();
if (code.HasHostFeature(HostFeature::AVX512_OrthoFloat)) {
FCODE(vrsqrt14p)(result, operand);
@@ -1772,7 +1772,7 @@ static void EmitRSqrtEstimate(BlockOfCode& code, EmitContext& ctx, IR::Inst* ins
}
}
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
return;
}
@@ -1780,9 +1780,9 @@ static void EmitRSqrtEstimate(BlockOfCode& code, EmitContext& ctx, IR::Inst* ins
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const bool fpcr_controlled = args[1].GetImmediateU1();
const Xbyak::Xmm operand = ctx.reg_alloc.UseXmm(code, args[0]);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm value = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm operand = ctx.reg_alloc.UseXmm(args[0]);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm();
const Xbyak::Xmm value = ctx.reg_alloc.ScratchXmm();
SharedLabel bad_values = GenSharedLabel(), end = GenSharedLabel();
@@ -1816,7 +1816,7 @@ static void EmitRSqrtEstimate(BlockOfCode& code, EmitContext& ctx, IR::Inst* ins
code.jmp(*end, code.T_NEAR);
});
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
return;
}
}
@@ -1851,9 +1851,9 @@ static void EmitRSqrtStepFused(BlockOfCode& code, EmitContext& ctx, IR::Inst* in
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const bool fpcr_controlled = args[2].GetImmediateU1();
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm operand1 = ctx.reg_alloc.UseXmm(code, args[0]);
const Xbyak::Xmm operand2 = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm();
const Xbyak::Xmm operand1 = ctx.reg_alloc.UseXmm(args[0]);
const Xbyak::Xmm operand2 = ctx.reg_alloc.UseXmm(args[1]);
MaybeStandardFPSCRValue(code, ctx, fpcr_controlled, [&] {
code.vmovaps(result, GetVectorOf<fsize, false, 0, 3>(code));
@@ -1861,7 +1861,7 @@ static void EmitRSqrtStepFused(BlockOfCode& code, EmitContext& ctx, IR::Inst* in
FCODE(vmulp)(result, result, GetVectorOf<fsize, false, -1, 1>(code));
});
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
return;
}
@@ -1869,11 +1869,11 @@ static void EmitRSqrtStepFused(BlockOfCode& code, EmitContext& ctx, IR::Inst* in
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const bool fpcr_controlled = args[2].GetImmediateU1();
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm operand1 = ctx.reg_alloc.UseXmm(code, args[0]);
const Xbyak::Xmm operand2 = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm tmp = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm mask = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm();
const Xbyak::Xmm operand1 = ctx.reg_alloc.UseXmm(args[0]);
const Xbyak::Xmm operand2 = ctx.reg_alloc.UseXmm(args[1]);
const Xbyak::Xmm tmp = ctx.reg_alloc.ScratchXmm();
const Xbyak::Xmm mask = ctx.reg_alloc.ScratchXmm();
SharedLabel end = GenSharedLabel(), fallback = GenSharedLabel();
@@ -1902,23 +1902,23 @@ static void EmitRSqrtStepFused(BlockOfCode& code, EmitContext& ctx, IR::Inst* in
code.jmp(*end, code.T_NEAR);
});
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
return;
}
if (ctx.HasOptimization(OptimizationFlag::Unsafe_UnfuseFMA)) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const Xbyak::Xmm operand1 = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm operand2 = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm operand1 = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm operand2 = ctx.reg_alloc.UseXmm(args[1]);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm();
code.movaps(result, GetVectorOf<fsize, false, 0, 3>(code));
FCODE(mulp)(operand1, operand2);
FCODE(subp)(result, operand1);
FCODE(mulp)(result, GetVectorOf<fsize, false, -1, 1>(code));
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
return;
}
}
@@ -1972,12 +1972,12 @@ void EmitX64::EmitFPVectorToHalf32(EmitContext& ctx, IR::Inst* inst) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const auto round_imm = ConvertRoundingModeToX64Immediate(rounding_mode);
const Xbyak::Xmm result = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm result = ctx.reg_alloc.UseScratchXmm(args[0]);
ForceToDefaultNaN<32>(code, ctx.FPCR(fpcr_controlled), result);
code.vcvtps2ph(result, result, u8(*round_imm));
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
} else {
switch (rounding_mode) {
case FP::RoundingMode::ToNearest_TieEven:
@@ -2018,7 +2018,7 @@ void EmitFPVectorToFixed(BlockOfCode& code, EmitContext& ctx, IR::Inst* inst) {
if (code.HasHostFeature(HostFeature::SSE41) && rounding != FP::RoundingMode::ToNearest_TieAwayFromZero) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const Xbyak::Xmm src = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm src = ctx.reg_alloc.UseScratchXmm(args[0]);
MaybeStandardFPSCRValue(code, ctx, fpcr_controlled, [&] {
const int round_imm = [&] {
@@ -2045,8 +2045,8 @@ void EmitFPVectorToFixed(BlockOfCode& code, EmitContext& ctx, IR::Inst* inst) {
if (code.HasHostFeature(HostFeature::AVX512_OrthoFloat)) {
code.vcvttpd2qq(src, src);
} else {
const Xbyak::Reg64 hi = ctx.reg_alloc.ScratchGpr(code);
const Xbyak::Reg64 lo = ctx.reg_alloc.ScratchGpr(code);
const Xbyak::Reg64 hi = ctx.reg_alloc.ScratchGpr();
const Xbyak::Reg64 lo = ctx.reg_alloc.ScratchGpr();
code.cvttsd2si(lo, src);
code.punpckhqdq(src, src);
@@ -2093,12 +2093,12 @@ void EmitFPVectorToFixed(BlockOfCode& code, EmitContext& ctx, IR::Inst* inst) {
FCODE(andp)(src, xmm0);
// Will we exceed unsigned range?
const Xbyak::Xmm exceed_unsigned = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm exceed_unsigned = ctx.reg_alloc.ScratchXmm();
code.movaps(exceed_unsigned, GetVectorOf<fsize, float_upper_limit_unsigned>(code));
FCODE(cmplep)(exceed_unsigned, src);
// Will we exceed signed range?
const Xbyak::Xmm tmp = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm tmp = ctx.reg_alloc.ScratchXmm();
code.movaps(tmp, GetVectorOf<fsize, float_upper_limit_signed>(code));
code.movaps(xmm0, tmp);
FCODE(cmplep)(xmm0, src);
@@ -2122,7 +2122,7 @@ void EmitFPVectorToFixed(BlockOfCode& code, EmitContext& ctx, IR::Inst* inst) {
}
});
ctx.reg_alloc.DefineValue(code, inst, src);
ctx.reg_alloc.DefineValue(inst, src);
return;
}
}
@@ -26,9 +26,9 @@ namespace {
void EmitVectorSaturatedNative(BlockOfCode& code, EmitContext& ctx, IR::Inst* inst, void (Xbyak::CodeGenerator::*saturated_fn)(const Xbyak::Mmx& mmx, const Xbyak::Operand&), void (Xbyak::CodeGenerator::*unsaturated_fn)(const Xbyak::Mmx& mmx, const Xbyak::Operand&), void (Xbyak::CodeGenerator::*sub_fn)(const Xbyak::Mmx& mmx, const Xbyak::Operand&)) {
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
const Xbyak::Xmm result = ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm addend = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Reg8 overflow = ctx.reg_alloc.ScratchGpr(code).cvt8();
const Xbyak::Xmm result = ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm addend = ctx.reg_alloc.UseXmm(args[1]);
const Xbyak::Reg8 overflow = ctx.reg_alloc.ScratchGpr().cvt8();
code.movaps(xmm0, result);
@@ -39,7 +39,7 @@ void EmitVectorSaturatedNative(BlockOfCode& code, EmitContext& ctx, IR::Inst* in
if (code.HasHostFeature(HostFeature::SSE41)) {
code.ptest(xmm0, xmm0);
} else {
const Xbyak::Xmm tmp = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm tmp = ctx.reg_alloc.ScratchXmm();
code.pxor(tmp, tmp);
code.pcmpeqw(xmm0, tmp);
code.pmovmskb(overflow.cvt32(), xmm0);
@@ -49,7 +49,7 @@ void EmitVectorSaturatedNative(BlockOfCode& code, EmitContext& ctx, IR::Inst* in
code.setnz(overflow);
code.or_(code.byte[code.ABI_JIT_PTR + code.GetJitStateInfo().offsetof_fpsr_qc], overflow);
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
}
enum class Op {
@@ -65,10 +65,10 @@ void EmitVectorSignedSaturated(BlockOfCode& code, EmitContext& ctx, IR::Inst* in
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
if (code.HasHostFeature(HostFeature::AVX512_Ortho | HostFeature::AVX512DQ)) {
const Xbyak::Xmm operand1 = ctx.reg_alloc.UseXmm(code, args[0]);
const Xbyak::Xmm operand2 = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Reg8 overflow = ctx.reg_alloc.ScratchGpr(code).cvt8();
const Xbyak::Xmm operand1 = ctx.reg_alloc.UseXmm(args[0]);
const Xbyak::Xmm operand2 = ctx.reg_alloc.UseXmm(args[1]);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm();
const Xbyak::Reg8 overflow = ctx.reg_alloc.ScratchGpr().cvt8();
code.movaps(xmm0, operand1);
@@ -91,15 +91,15 @@ void EmitVectorSignedSaturated(BlockOfCode& code, EmitContext& ctx, IR::Inst* in
code.setnz(overflow);
code.or_(code.byte[code.ABI_JIT_PTR + code.GetJitStateInfo().offsetof_fpsr_qc], overflow);
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
return;
}
const Xbyak::Xmm operand1 = code.HasHostFeature(HostFeature::AVX) ? ctx.reg_alloc.UseXmm(code, args[0]) : ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm operand2 = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm result = code.HasHostFeature(HostFeature::AVX) ? ctx.reg_alloc.ScratchXmm(code) : operand1;
const Xbyak::Reg8 overflow = ctx.reg_alloc.ScratchGpr(code).cvt8();
const Xbyak::Xmm tmp = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm operand1 = code.HasHostFeature(HostFeature::AVX) ? ctx.reg_alloc.UseXmm(args[0]) : ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm operand2 = ctx.reg_alloc.UseXmm(args[1]);
const Xbyak::Xmm result = code.HasHostFeature(HostFeature::AVX) ? ctx.reg_alloc.ScratchXmm() : operand1;
const Xbyak::Reg8 overflow = ctx.reg_alloc.ScratchGpr().cvt8();
const Xbyak::Xmm tmp = ctx.reg_alloc.ScratchXmm();
if (code.HasHostFeature(HostFeature::AVX)) {
if constexpr (op == Op::Add) {
@@ -150,7 +150,7 @@ void EmitVectorSignedSaturated(BlockOfCode& code, EmitContext& ctx, IR::Inst* in
if (code.HasHostFeature(HostFeature::SSE41)) {
FCODE(blendvp)(result, tmp);
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
} else {
code.psrad(xmm0, 31);
if constexpr (esize == 64) {
@@ -161,7 +161,7 @@ void EmitVectorSignedSaturated(BlockOfCode& code, EmitContext& ctx, IR::Inst* in
code.pandn(xmm0, result);
code.por(tmp, xmm0);
ctx.reg_alloc.DefineValue(code, inst, tmp);
ctx.reg_alloc.DefineValue(inst, tmp);
}
}
@@ -172,10 +172,10 @@ void EmitVectorUnsignedSaturated(BlockOfCode& code, EmitContext& ctx, IR::Inst*
auto args = ctx.reg_alloc.GetArgumentInfo(inst);
if (code.HasHostFeature(HostFeature::AVX512_Ortho | HostFeature::AVX512DQ)) {
const Xbyak::Xmm operand1 = ctx.reg_alloc.UseXmm(code, args[0]);
const Xbyak::Xmm operand2 = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Reg8 overflow = ctx.reg_alloc.ScratchGpr(code).cvt8();
const Xbyak::Xmm operand1 = ctx.reg_alloc.UseXmm(args[0]);
const Xbyak::Xmm operand2 = ctx.reg_alloc.UseXmm(args[1]);
const Xbyak::Xmm result = ctx.reg_alloc.ScratchXmm();
const Xbyak::Reg8 overflow = ctx.reg_alloc.ScratchGpr().cvt8();
if constexpr (op == Op::Add) {
ICODE(vpadd)(result, operand1, operand2);
@@ -191,15 +191,15 @@ void EmitVectorUnsignedSaturated(BlockOfCode& code, EmitContext& ctx, IR::Inst*
code.setnz(overflow);
code.or_(code.byte[code.ABI_JIT_PTR + code.GetJitStateInfo().offsetof_fpsr_qc], overflow);
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
return;
}
const Xbyak::Xmm operand1 = code.HasHostFeature(HostFeature::AVX) ? ctx.reg_alloc.UseXmm(code, args[0]) : ctx.reg_alloc.UseScratchXmm(code, args[0]);
const Xbyak::Xmm operand2 = ctx.reg_alloc.UseXmm(code, args[1]);
const Xbyak::Xmm result = code.HasHostFeature(HostFeature::AVX) ? ctx.reg_alloc.ScratchXmm(code) : operand1;
const Xbyak::Reg8 overflow = ctx.reg_alloc.ScratchGpr(code).cvt8();
const Xbyak::Xmm tmp = ctx.reg_alloc.ScratchXmm(code);
const Xbyak::Xmm operand1 = code.HasHostFeature(HostFeature::AVX) ? ctx.reg_alloc.UseXmm(args[0]) : ctx.reg_alloc.UseScratchXmm(args[0]);
const Xbyak::Xmm operand2 = ctx.reg_alloc.UseXmm(args[1]);
const Xbyak::Xmm result = code.HasHostFeature(HostFeature::AVX) ? ctx.reg_alloc.ScratchXmm() : operand1;
const Xbyak::Reg8 overflow = ctx.reg_alloc.ScratchGpr().cvt8();
const Xbyak::Xmm tmp = ctx.reg_alloc.ScratchXmm();
if constexpr (op == Op::Add) {
if (code.HasHostFeature(HostFeature::AVX)) {
@@ -252,10 +252,10 @@ void EmitVectorUnsignedSaturated(BlockOfCode& code, EmitContext& ctx, IR::Inst*
if constexpr (op == Op::Add) {
code.por(result, tmp);
ctx.reg_alloc.DefineValue(code, inst, result);
ctx.reg_alloc.DefineValue(inst, result);
} else {
code.pandn(tmp, result);
ctx.reg_alloc.DefineValue(code, inst, tmp);
ctx.reg_alloc.DefineValue(inst, tmp);
}
}


@@ -0,0 +1,25 @@
/* This file is part of the dynarmic project.
* Copyright (c) 2016 MerryMage
* SPDX-License-Identifier: 0BSD
*/
#include "dynarmic/backend/x64/hostloc.h"
#include <xbyak/xbyak.h>
#include "dynarmic/backend/x64/abi.h"
#include "dynarmic/backend/x64/stack_layout.h"
namespace Dynarmic::Backend::X64 {
Xbyak::Reg64 HostLocToReg64(HostLoc loc) {
ASSERT(HostLocIsGPR(loc));
return Xbyak::Reg64(static_cast<int>(loc));
}
Xbyak::Xmm HostLocToXmm(HostLoc loc) {
ASSERT(HostLocIsXMM(loc));
return Xbyak::Xmm(static_cast<int>(loc) - static_cast<int>(HostLoc::XMM0));
}
} // namespace Dynarmic::Backend::X64


@@ -152,14 +152,7 @@ const HostLocList any_xmm = {
HostLoc::XMM15,
};
inline Xbyak::Reg64 HostLocToReg64(HostLoc loc) noexcept {
ASSERT(HostLocIsGPR(loc));
return Xbyak::Reg64(int(loc));
}
inline Xbyak::Xmm HostLocToXmm(HostLoc loc) noexcept {
ASSERT(HostLocIsXMM(loc));
return Xbyak::Xmm(int(loc) - int(HostLoc::XMM0));
}
Xbyak::Reg64 HostLocToReg64(HostLoc loc);
Xbyak::Xmm HostLocToXmm(HostLoc loc);
} // namespace Dynarmic::Backend::X64


@@ -24,6 +24,15 @@
namespace Dynarmic::Backend::X64 {
#define MAYBE_AVX(OPCODE, ...) \
[&] { \
if (code->HasHostFeature(HostFeature::AVX)) { \
code->v##OPCODE(__VA_ARGS__); \
} else { \
code->OPCODE(__VA_ARGS__); \
} \
}()
static inline bool CanExchange(const HostLoc a, const HostLoc b) noexcept {
return HostLocIsGPR(a) && HostLocIsGPR(b);
}
@@ -98,14 +107,14 @@ void HostLocInfo::AddValue(IR::Inst* inst) noexcept {
max_bit_width = std::max<uint8_t>(max_bit_width, std::countr_zero(GetBitWidth(inst->GetType())));
}
void HostLocInfo::EmitVerboseDebuggingOutput(BlockOfCode& code, size_t host_loc_index) const noexcept {
void HostLocInfo::EmitVerboseDebuggingOutput(BlockOfCode* code, size_t host_loc_index) const noexcept {
using namespace Xbyak::util;
for (auto const value : values) {
code.mov(code.ABI_PARAM1, rsp);
code.mov(code.ABI_PARAM2, host_loc_index);
code.mov(code.ABI_PARAM3, value->GetName());
code.mov(code.ABI_PARAM4, GetBitWidth(value->GetType()));
code.CallFunction(PrintVerboseDebuggingOutputLine);
code->mov(code->ABI_PARAM1, rsp);
code->mov(code->ABI_PARAM2, host_loc_index);
code->mov(code->ABI_PARAM3, value->GetName());
code->mov(code->ABI_PARAM4, GetBitWidth(value->GetType()));
code->CallFunction(PrintVerboseDebuggingOutputLine);
}
}
@@ -119,7 +128,7 @@ bool Argument::FitsInImmediateU32() const noexcept {
bool Argument::FitsInImmediateS32() const noexcept {
if (!IsImmediate())
return false;
const s64 imm = s64(value.GetImmediateAsU64());
const s64 imm = static_cast<s64>(value.GetImmediateAsU64());
return -s64(0x80000000) <= imm && imm <= s64(0x7FFFFFFF);
}
@@ -165,38 +174,36 @@ IR::AccType Argument::GetImmediateAccType() const noexcept {
}
/// Is this value currently in a GPR?
bool Argument::IsInGpr(RegAlloc& reg_alloc) const noexcept {
bool Argument::IsInGpr() const noexcept {
if (IsImmediate())
return false;
return HostLocIsGPR(*reg_alloc.ValueLocation(value.GetInst()));
}
/// Is this value currently in a XMM?
bool Argument::IsInXmm(RegAlloc& reg_alloc) const noexcept {
bool Argument::IsInXmm() const noexcept {
if (IsImmediate())
return false;
return HostLocIsXMM(*reg_alloc.ValueLocation(value.GetInst()));
}
/// Is this value currently in memory?
bool Argument::IsInMemory(RegAlloc& reg_alloc) const noexcept {
bool Argument::IsInMemory() const noexcept {
if (IsImmediate())
return false;
return HostLocIsSpill(*reg_alloc.ValueLocation(value.GetInst()));
}
RegAlloc::RegAlloc(boost::container::static_vector<HostLoc, 28> gpr_order, boost::container::static_vector<HostLoc, 28> xmm_order) noexcept
RegAlloc::RegAlloc(BlockOfCode* code, boost::container::static_vector<HostLoc, 28> gpr_order, boost::container::static_vector<HostLoc, 28> xmm_order) noexcept
: gpr_order(gpr_order),
xmm_order(xmm_order)
xmm_order(xmm_order),
code(code)
{}
//static std::uint64_t Zfncwjkrt_blockOfCodeShim = 0;
RegAlloc::ArgumentInfo RegAlloc::GetArgumentInfo(const IR::Inst* inst) noexcept {
ArgumentInfo ret{
Argument{},
Argument{},
Argument{},
Argument{}
};
ArgumentInfo ret{Argument{*this}, Argument{*this}, Argument{*this}, Argument{*this}};
for (size_t i = 0; i < inst->NumArgs(); i++) {
const auto arg = inst->GetArg(i);
ret[i].value = arg;
@@ -221,34 +228,34 @@ void RegAlloc::RegisterPseudoOperation(const IR::Inst* inst) noexcept {
}
}
Xbyak::Reg64 RegAlloc::UseScratchGpr(BlockOfCode& code, Argument& arg) noexcept {
Xbyak::Reg64 RegAlloc::UseScratchGpr(Argument& arg) noexcept {
ASSERT(!arg.allocated);
arg.allocated = true;
return HostLocToReg64(UseScratchImpl(code, arg.value, gpr_order));
return HostLocToReg64(UseScratchImpl(arg.value, gpr_order));
}
Xbyak::Xmm RegAlloc::UseScratchXmm(BlockOfCode& code, Argument& arg) noexcept {
Xbyak::Xmm RegAlloc::UseScratchXmm(Argument& arg) noexcept {
ASSERT(!arg.allocated);
arg.allocated = true;
return HostLocToXmm(UseScratchImpl(code, arg.value, xmm_order));
return HostLocToXmm(UseScratchImpl(arg.value, xmm_order));
}
void RegAlloc::UseScratch(BlockOfCode& code, Argument& arg, HostLoc host_loc) noexcept {
void RegAlloc::UseScratch(Argument& arg, HostLoc host_loc) noexcept {
ASSERT(!arg.allocated);
arg.allocated = true;
UseScratchImpl(code, arg.value, {host_loc});
UseScratchImpl(arg.value, {host_loc});
}
void RegAlloc::DefineValue(BlockOfCode& code, IR::Inst* inst, const Xbyak::Reg& reg) noexcept {
void RegAlloc::DefineValue(IR::Inst* inst, const Xbyak::Reg& reg) noexcept {
ASSERT(reg.getKind() == Xbyak::Operand::XMM || reg.getKind() == Xbyak::Operand::REG);
const auto hostloc = static_cast<HostLoc>(reg.getIdx() + static_cast<size_t>(reg.getKind() == Xbyak::Operand::XMM ? HostLoc::XMM0 : HostLoc::RAX));
DefineValueImpl(code, inst, hostloc);
DefineValueImpl(inst, hostloc);
}
void RegAlloc::DefineValue(BlockOfCode& code, IR::Inst* inst, Argument& arg) noexcept {
void RegAlloc::DefineValue(IR::Inst* inst, Argument& arg) noexcept {
ASSERT(!arg.allocated);
arg.allocated = true;
DefineValueImpl(code, inst, arg.value);
DefineValueImpl(inst, arg.value);
}
void RegAlloc::Release(const Xbyak::Reg& reg) noexcept {
@@ -257,9 +264,9 @@ void RegAlloc::Release(const Xbyak::Reg& reg) noexcept {
LocInfo(hostloc).ReleaseOne();
}
HostLoc RegAlloc::UseImpl(BlockOfCode& code, IR::Value use_value, const boost::container::static_vector<HostLoc, 28>& desired_locations) noexcept {
HostLoc RegAlloc::UseImpl(IR::Value use_value, const boost::container::static_vector<HostLoc, 28>& desired_locations) noexcept {
if (use_value.IsImmediate()) {
return LoadImmediate(code, use_value, ScratchImpl(code, desired_locations));
return LoadImmediate(use_value, ScratchImpl(desired_locations));
}
const auto* use_inst = use_value.GetInst();
@@ -273,25 +280,25 @@ HostLoc RegAlloc::UseImpl(BlockOfCode& code, IR::Value use_value, const boost::c
}
if (LocInfo(current_location).IsLocked()) {
return UseScratchImpl(code, use_value, desired_locations);
return UseScratchImpl(use_value, desired_locations);
}
const HostLoc destination_location = SelectARegister(desired_locations);
if (max_bit_width > HostLocBitWidth(destination_location)) {
return UseScratchImpl(code, use_value, desired_locations);
return UseScratchImpl(use_value, desired_locations);
} else if (CanExchange(destination_location, current_location)) {
Exchange(code, destination_location, current_location);
Exchange(destination_location, current_location);
} else {
MoveOutOfTheWay(code, destination_location);
Move(code, destination_location, current_location);
MoveOutOfTheWay(destination_location);
Move(destination_location, current_location);
}
LocInfo(destination_location).ReadLock();
return destination_location;
}
HostLoc RegAlloc::UseScratchImpl(BlockOfCode& code, IR::Value use_value, const boost::container::static_vector<HostLoc, 28>& desired_locations) noexcept {
HostLoc RegAlloc::UseScratchImpl(IR::Value use_value, const boost::container::static_vector<HostLoc, 28>& desired_locations) noexcept {
if (use_value.IsImmediate()) {
return LoadImmediate(code, use_value, ScratchImpl(code, desired_locations));
return LoadImmediate(use_value, ScratchImpl(desired_locations));
}
const auto* use_inst = use_value.GetInst();
@@ -301,7 +308,7 @@ HostLoc RegAlloc::UseScratchImpl(BlockOfCode& code, IR::Value use_value, const b
const bool can_use_current_location = std::find(desired_locations.begin(), desired_locations.end(), current_location) != desired_locations.end();
if (can_use_current_location && !LocInfo(current_location).IsLocked()) {
if (!LocInfo(current_location).IsLastUse()) {
MoveOutOfTheWay(code, current_location);
MoveOutOfTheWay(current_location);
} else {
LocInfo(current_location).SetLastUse();
}
@@ -310,22 +317,20 @@ HostLoc RegAlloc::UseScratchImpl(BlockOfCode& code, IR::Value use_value, const b
}
const HostLoc destination_location = SelectARegister(desired_locations);
MoveOutOfTheWay(code, destination_location);
CopyToScratch(code, bit_width, destination_location, current_location);
MoveOutOfTheWay(destination_location);
CopyToScratch(bit_width, destination_location, current_location);
LocInfo(destination_location).WriteLock();
return destination_location;
}
HostLoc RegAlloc::ScratchImpl(BlockOfCode& code, const boost::container::static_vector<HostLoc, 28>& desired_locations) noexcept {
HostLoc RegAlloc::ScratchImpl(const boost::container::static_vector<HostLoc, 28>& desired_locations) noexcept {
const HostLoc location = SelectARegister(desired_locations);
MoveOutOfTheWay(code, location);
MoveOutOfTheWay(location);
LocInfo(location).WriteLock();
return location;
}
void RegAlloc::HostCall(
BlockOfCode& code,
IR::Inst* result_def,
void RegAlloc::HostCall(IR::Inst* result_def,
const std::optional<Argument::copyable_reference> arg0,
const std::optional<Argument::copyable_reference> arg1,
const std::optional<Argument::copyable_reference> arg2,
@@ -343,20 +348,20 @@ void RegAlloc::HostCall(
return ret;
}();
ScratchGpr(code, ABI_RETURN);
if (result_def)
DefineValueImpl(code, result_def, ABI_RETURN);
ScratchGpr(ABI_RETURN);
if (result_def) {
DefineValueImpl(result_def, ABI_RETURN);
}
for (size_t i = 0; i < args.size(); i++) {
if (args[i]) {
UseScratch(code, *args[i], args_hostloc[i]);
UseScratch(*args[i], args_hostloc[i]);
} else {
ScratchGpr(code, args_hostloc[i]); // TODO: Force spill
ScratchGpr(args_hostloc[i]); // TODO: Force spill
}
}
// Must match with ScratchImpl
for (auto const gpr : other_caller_save) {
MoveOutOfTheWay(code, gpr);
MoveOutOfTheWay(gpr);
LocInfo(gpr).WriteLock();
}
for (size_t i = 0; i < args.size(); i++) {
@@ -365,13 +370,13 @@ void RegAlloc::HostCall(
const Xbyak::Reg64 reg = HostLocToReg64(args_hostloc[i]);
switch (args[i]->get().GetType()) {
case IR::Type::U8:
code.movzx(reg.cvt32(), reg.cvt8());
code->movzx(reg.cvt32(), reg.cvt8());
break;
case IR::Type::U16:
code.movzx(reg.cvt32(), reg.cvt16());
code->movzx(reg.cvt32(), reg.cvt16());
break;
case IR::Type::U32:
code.mov(reg.cvt32(), reg.cvt32());
code->mov(reg.cvt32(), reg.cvt32());
break;
case IR::Type::U64:
break; //no op
@@ -382,18 +387,18 @@ void RegAlloc::HostCall(
}
}
void RegAlloc::AllocStackSpace(BlockOfCode& code, const size_t stack_space) noexcept {
void RegAlloc::AllocStackSpace(const size_t stack_space) noexcept {
ASSERT(stack_space < size_t((std::numeric_limits<s32>::max)()));
ASSERT(reserved_stack_space == 0);
reserved_stack_space = stack_space;
code.sub(code.rsp, u32(stack_space));
code->sub(code->rsp, u32(stack_space));
}
void RegAlloc::ReleaseStackSpace(BlockOfCode& code, const size_t stack_space) noexcept {
void RegAlloc::ReleaseStackSpace(const size_t stack_space) noexcept {
ASSERT(stack_space < size_t((std::numeric_limits<s32>::max)()));
ASSERT(reserved_stack_space == stack_space);
reserved_stack_space = 0;
code.add(code.rsp, u32(stack_space));
code->add(code->rsp, u32(stack_space));
}
HostLoc RegAlloc::SelectARegister(const boost::container::static_vector<HostLoc, 28>& desired_locations) const noexcept {
@@ -453,75 +458,92 @@ HostLoc RegAlloc::SelectARegister(const boost::container::static_vector<HostLoc,
return *it_final;
}
std::optional<HostLoc> RegAlloc::ValueLocation(const IR::Inst* value) const noexcept {
for (size_t i = 0; i < hostloc_info.size(); i++)
if (hostloc_info[i].ContainsValue(value))
return HostLoc(i);
return std::nullopt;
}
void RegAlloc::DefineValueImpl(BlockOfCode& code, IR::Inst* def_inst, HostLoc host_loc) noexcept {
void RegAlloc::DefineValueImpl(IR::Inst* def_inst, HostLoc host_loc) noexcept {
ASSERT(!ValueLocation(def_inst) && "def_inst has already been defined");
LocInfo(host_loc).AddValue(def_inst);
}
void RegAlloc::DefineValueImpl(BlockOfCode& code, IR::Inst* def_inst, const IR::Value& use_inst) noexcept {
void RegAlloc::DefineValueImpl(IR::Inst* def_inst, const IR::Value& use_inst) noexcept {
ASSERT(!ValueLocation(def_inst) && "def_inst has already been defined");
if (use_inst.IsImmediate()) {
const HostLoc location = ScratchImpl(code, gpr_order);
DefineValueImpl(code, def_inst, location);
LoadImmediate(code, use_inst, location);
const HostLoc location = ScratchImpl(gpr_order);
DefineValueImpl(def_inst, location);
LoadImmediate(use_inst, location);
return;
}
ASSERT(ValueLocation(use_inst.GetInst()) && "use_inst must already be defined");
const HostLoc location = *ValueLocation(use_inst.GetInst());
DefineValueImpl(code, def_inst, location);
DefineValueImpl(def_inst, location);
}
void RegAlloc::Move(BlockOfCode& code, HostLoc to, HostLoc from) noexcept {
HostLoc RegAlloc::LoadImmediate(IR::Value imm, HostLoc host_loc) noexcept {
ASSERT(imm.IsImmediate() && "imm is not an immediate");
if (HostLocIsGPR(host_loc)) {
const Xbyak::Reg64 reg = HostLocToReg64(host_loc);
const u64 imm_value = imm.GetImmediateAsU64();
if (imm_value == 0) {
code->xor_(reg.cvt32(), reg.cvt32());
} else {
code->mov(reg, imm_value);
}
} else if (HostLocIsXMM(host_loc)) {
const Xbyak::Xmm reg = HostLocToXmm(host_loc);
const u64 imm_value = imm.GetImmediateAsU64();
if (imm_value == 0) {
MAYBE_AVX(xorps, reg, reg);
} else {
MAYBE_AVX(movaps, reg, code->Const(code->xword, imm_value));
}
} else {
UNREACHABLE();
}
return host_loc;
}
void RegAlloc::Move(HostLoc to, HostLoc from) noexcept {
const size_t bit_width = LocInfo(from).GetMaxBitWidth();
ASSERT(LocInfo(to).IsEmpty() && !LocInfo(from).IsLocked());
ASSERT(bit_width <= HostLocBitWidth(to));
ASSERT(!LocInfo(from).IsEmpty() && "Mov eliminated");
EmitMove(code, bit_width, to, from);
EmitMove(bit_width, to, from);
LocInfo(to) = std::exchange(LocInfo(from), {});
}
void RegAlloc::CopyToScratch(BlockOfCode& code, size_t bit_width, HostLoc to, HostLoc from) noexcept {
void RegAlloc::CopyToScratch(size_t bit_width, HostLoc to, HostLoc from) noexcept {
ASSERT(LocInfo(to).IsEmpty() && !LocInfo(from).IsEmpty());
EmitMove(code, bit_width, to, from);
EmitMove(bit_width, to, from);
}
void RegAlloc::Exchange(BlockOfCode& code, HostLoc a, HostLoc b) noexcept {
void RegAlloc::Exchange(HostLoc a, HostLoc b) noexcept {
ASSERT(!LocInfo(a).IsLocked() && !LocInfo(b).IsLocked());
ASSERT(LocInfo(a).GetMaxBitWidth() <= HostLocBitWidth(b));
ASSERT(LocInfo(b).GetMaxBitWidth() <= HostLocBitWidth(a));
if (LocInfo(a).IsEmpty()) {
Move(code, a, b);
Move(a, b);
} else if (LocInfo(b).IsEmpty()) {
Move(code, b, a);
Move(b, a);
} else {
EmitExchange(code, a, b);
EmitExchange(a, b);
std::swap(LocInfo(a), LocInfo(b));
}
}
void RegAlloc::MoveOutOfTheWay(BlockOfCode& code, HostLoc reg) noexcept {
void RegAlloc::MoveOutOfTheWay(HostLoc reg) noexcept {
ASSERT(!LocInfo(reg).IsLocked());
if (!LocInfo(reg).IsEmpty()) {
SpillRegister(code, reg);
SpillRegister(reg);
}
}
void RegAlloc::SpillRegister(BlockOfCode& code, HostLoc loc) noexcept {
void RegAlloc::SpillRegister(HostLoc loc) noexcept {
ASSERT(HostLocIsRegister(loc) && "Only registers can be spilled");
ASSERT(!LocInfo(loc).IsEmpty() && "There is no need to spill unoccupied registers");
ASSERT(!LocInfo(loc).IsLocked() && "Registers that have been allocated must not be spilt");
auto const new_loc = FindFreeSpill(HostLocIsXMM(loc));
Move(code, new_loc, loc);
Move(new_loc, loc);
}
HostLoc RegAlloc::FindFreeSpill(bool is_xmm) const noexcept {
@@ -546,39 +568,9 @@ HostLoc RegAlloc::FindFreeSpill(bool is_xmm) const noexcept {
if (const auto loc = HostLoc(i); LocInfo(loc).IsEmpty())
return loc;
UNREACHABLE();
}
};
#define MAYBE_AVX(OPCODE, ...) \
[&] { \
if (code.HasHostFeature(HostFeature::AVX)) code.v##OPCODE(__VA_ARGS__); \
else code.OPCODE(__VA_ARGS__); \
}()
HostLoc RegAlloc::LoadImmediate(BlockOfCode& code, IR::Value imm, HostLoc host_loc) noexcept {
ASSERT(imm.IsImmediate() && "imm is not an immediate");
if (HostLocIsGPR(host_loc)) {
const Xbyak::Reg64 reg = HostLocToReg64(host_loc);
const u64 imm_value = imm.GetImmediateAsU64();
if (imm_value == 0) {
code.xor_(reg.cvt32(), reg.cvt32());
} else {
code.mov(reg, imm_value);
}
} else if (HostLocIsXMM(host_loc)) {
const Xbyak::Xmm reg = HostLocToXmm(host_loc);
const u64 imm_value = imm.GetImmediateAsU64();
if (imm_value == 0) {
MAYBE_AVX(xorps, reg, reg);
} else {
MAYBE_AVX(movaps, reg, code.Const(code.xword, imm_value));
}
} else {
UNREACHABLE();
}
return host_loc;
}
void RegAlloc::EmitMove(BlockOfCode& code, const size_t bit_width, const HostLoc to, const HostLoc from) noexcept {
void RegAlloc::EmitMove(const size_t bit_width, const HostLoc to, const HostLoc from) noexcept {
auto const spill_to_op_arg_helper = [&](HostLoc loc, size_t reserved_stack_space) {
ASSERT(HostLocIsSpill(loc));
size_t i = size_t(loc) - size_t(HostLoc::FirstSpill);
@@ -593,9 +585,9 @@ void RegAlloc::EmitMove(BlockOfCode& code, const size_t bit_width, const HostLoc
} else if (HostLocIsGPR(to) && HostLocIsGPR(from)) {
ASSERT(bit_width != 128);
if (bit_width == 64) {
code.mov(HostLocToReg64(to), HostLocToReg64(from));
code->mov(HostLocToReg64(to), HostLocToReg64(from));
} else {
code.mov(HostLocToReg64(to).cvt32(), HostLocToReg64(from).cvt32());
code->mov(HostLocToReg64(to).cvt32(), HostLocToReg64(from).cvt32());
}
} else if (HostLocIsXMM(to) && HostLocIsGPR(from)) {
ASSERT(bit_width != 128);
@@ -650,26 +642,25 @@ void RegAlloc::EmitMove(BlockOfCode& code, const size_t bit_width, const HostLoc
} else if (HostLocIsGPR(to) && HostLocIsSpill(from)) {
ASSERT(bit_width != 128);
if (bit_width == 64) {
code.mov(HostLocToReg64(to), Xbyak::util::qword[spill_to_op_arg_helper(from, reserved_stack_space)]);
code->mov(HostLocToReg64(to), Xbyak::util::qword[spill_to_op_arg_helper(from, reserved_stack_space)]);
} else {
code.mov(HostLocToReg64(to).cvt32(), Xbyak::util::dword[spill_to_op_arg_helper(from, reserved_stack_space)]);
code->mov(HostLocToReg64(to).cvt32(), Xbyak::util::dword[spill_to_op_arg_helper(from, reserved_stack_space)]);
}
} else if (HostLocIsSpill(to) && HostLocIsGPR(from)) {
ASSERT(bit_width != 128);
if (bit_width == 64) {
code.mov(Xbyak::util::qword[spill_to_op_arg_helper(to, reserved_stack_space)], HostLocToReg64(from));
code->mov(Xbyak::util::qword[spill_to_op_arg_helper(to, reserved_stack_space)], HostLocToReg64(from));
} else {
code.mov(Xbyak::util::dword[spill_to_op_arg_helper(to, reserved_stack_space)], HostLocToReg64(from).cvt32());
code->mov(Xbyak::util::dword[spill_to_op_arg_helper(to, reserved_stack_space)], HostLocToReg64(from).cvt32());
}
} else {
UNREACHABLE();
}
}
#undef MAYBE_AVX
void RegAlloc::EmitExchange(BlockOfCode& code, const HostLoc a, const HostLoc b) noexcept {
void RegAlloc::EmitExchange(const HostLoc a, const HostLoc b) noexcept {
ASSERT(HostLocIsGPR(a) && HostLocIsGPR(b) && "Exchanging XMM registers is unneeded OR invalid emit");
code.xchg(HostLocToReg64(a), HostLocToReg64(b));
code->xchg(HostLocToReg64(a), HostLocToReg64(b));
}
} // namespace Dynarmic::Backend::X64


@@ -81,7 +81,7 @@ public:
return 1 << max_bit_width;
}
void AddValue(IR::Inst* inst) noexcept;
void EmitVerboseDebuggingOutput(BlockOfCode& code, size_t host_loc_index) const noexcept;
void EmitVerboseDebuggingOutput(BlockOfCode* code, size_t host_loc_index) const noexcept;
private:
//non trivial
boost::container::small_vector<IR::Inst*, 3> values; //24
@@ -129,15 +129,16 @@ public:
IR::AccType GetImmediateAccType() const noexcept;
/// Is this value currently in a GPR?
bool IsInGpr(RegAlloc& reg_alloc) const noexcept;
bool IsInXmm(RegAlloc& reg_alloc) const noexcept;
bool IsInMemory(RegAlloc& reg_alloc) const noexcept;
bool IsInGpr() const noexcept;
bool IsInXmm() const noexcept;
bool IsInMemory() const noexcept;
private:
friend class RegAlloc;
explicit Argument() {}
explicit Argument(RegAlloc& reg_alloc) : reg_alloc(reg_alloc) {}
//data
IR::Value value; //8
RegAlloc& reg_alloc; //8
bool allocated = false; //1
};
@@ -145,57 +146,55 @@ class RegAlloc final {
public:
using ArgumentInfo = std::array<Argument, IR::max_arg_count>;
RegAlloc() noexcept = default;
RegAlloc(boost::container::static_vector<HostLoc, 28> gpr_order, boost::container::static_vector<HostLoc, 28> xmm_order) noexcept;
RegAlloc(BlockOfCode* code, boost::container::static_vector<HostLoc, 28> gpr_order, boost::container::static_vector<HostLoc, 28> xmm_order) noexcept;
ArgumentInfo GetArgumentInfo(const IR::Inst* inst) noexcept;
void RegisterPseudoOperation(const IR::Inst* inst) noexcept;
inline bool IsValueLive(const IR::Inst* inst) const noexcept {
return !!ValueLocation(inst);
}
inline Xbyak::Reg64 UseGpr(BlockOfCode& code, Argument& arg) noexcept {
inline Xbyak::Reg64 UseGpr(Argument& arg) noexcept {
ASSERT(!arg.allocated);
arg.allocated = true;
return HostLocToReg64(UseImpl(code, arg.value, gpr_order));
return HostLocToReg64(UseImpl(arg.value, gpr_order));
}
inline Xbyak::Xmm UseXmm(BlockOfCode& code, Argument& arg) noexcept {
inline Xbyak::Xmm UseXmm(Argument& arg) noexcept {
ASSERT(!arg.allocated);
arg.allocated = true;
return HostLocToXmm(UseImpl(code, arg.value, xmm_order));
return HostLocToXmm(UseImpl(arg.value, xmm_order));
}
inline OpArg UseOpArg(BlockOfCode& code, Argument& arg) noexcept {
return UseGpr(code, arg);
inline OpArg UseOpArg(Argument& arg) noexcept {
return UseGpr(arg);
}
inline void Use(BlockOfCode& code, Argument& arg, const HostLoc host_loc) noexcept {
inline void Use(Argument& arg, const HostLoc host_loc) noexcept {
ASSERT(!arg.allocated);
arg.allocated = true;
UseImpl(code, arg.value, {host_loc});
UseImpl(arg.value, {host_loc});
}
Xbyak::Reg64 UseScratchGpr(BlockOfCode& code, Argument& arg) noexcept;
Xbyak::Xmm UseScratchXmm(BlockOfCode& code, Argument& arg) noexcept;
void UseScratch(BlockOfCode& code, Argument& arg, HostLoc host_loc) noexcept;
Xbyak::Reg64 UseScratchGpr(Argument& arg) noexcept;
Xbyak::Xmm UseScratchXmm(Argument& arg) noexcept;
void UseScratch(Argument& arg, HostLoc host_loc) noexcept;
void DefineValue(BlockOfCode& code, IR::Inst* inst, const Xbyak::Reg& reg) noexcept;
void DefineValue(BlockOfCode& code, IR::Inst* inst, Argument& arg) noexcept;
void DefineValue(IR::Inst* inst, const Xbyak::Reg& reg) noexcept;
void DefineValue(IR::Inst* inst, Argument& arg) noexcept;
void Release(const Xbyak::Reg& reg) noexcept;
inline Xbyak::Reg64 ScratchGpr(BlockOfCode& code) noexcept {
return HostLocToReg64(ScratchImpl(code, gpr_order));
inline Xbyak::Reg64 ScratchGpr() noexcept {
return HostLocToReg64(ScratchImpl(gpr_order));
}
inline Xbyak::Reg64 ScratchGpr(BlockOfCode& code, const HostLoc desired_location) noexcept {
return HostLocToReg64(ScratchImpl(code, {desired_location}));
inline Xbyak::Reg64 ScratchGpr(const HostLoc desired_location) noexcept {
return HostLocToReg64(ScratchImpl({desired_location}));
}
inline Xbyak::Xmm ScratchXmm(BlockOfCode& code) noexcept {
return HostLocToXmm(ScratchImpl(code, xmm_order));
inline Xbyak::Xmm ScratchXmm() noexcept {
return HostLocToXmm(ScratchImpl(xmm_order));
}
inline Xbyak::Xmm ScratchXmm(BlockOfCode& code, HostLoc desired_location) noexcept {
return HostLocToXmm(ScratchImpl(code, {desired_location}));
inline Xbyak::Xmm ScratchXmm(HostLoc desired_location) noexcept {
return HostLocToXmm(ScratchImpl({desired_location}));
}
void HostCall(
BlockOfCode& code,
IR::Inst* result_def = nullptr,
void HostCall(IR::Inst* result_def = nullptr,
const std::optional<Argument::copyable_reference> arg0 = {},
const std::optional<Argument::copyable_reference> arg1 = {},
const std::optional<Argument::copyable_reference> arg2 = {},
@@ -203,56 +202,67 @@ public:
) noexcept;
// TODO: Values in host flags
void AllocStackSpace(BlockOfCode& code, const size_t stack_space) noexcept;
void ReleaseStackSpace(BlockOfCode& code, const size_t stack_space) noexcept;
void AllocStackSpace(const size_t stack_space) noexcept;
void ReleaseStackSpace(const size_t stack_space) noexcept;
inline void EndOfAllocScope() noexcept {
for (auto& iter : hostloc_info)
for (auto& iter : hostloc_info) {
iter.ReleaseAll();
}
}
inline void AssertNoMoreUses() noexcept {
ASSERT(std::all_of(hostloc_info.begin(), hostloc_info.end(), [](const auto& i) noexcept { return i.IsEmpty(); }));
}
inline void EmitVerboseDebuggingOutput(BlockOfCode& code) noexcept {
for (size_t i = 0; i < hostloc_info.size(); i++)
inline void EmitVerboseDebuggingOutput() noexcept {
for (size_t i = 0; i < hostloc_info.size(); i++) {
hostloc_info[i].EmitVerboseDebuggingOutput(code, i);
}
}
private:
friend struct Argument;
HostLoc SelectARegister(const boost::container::static_vector<HostLoc, 28>& desired_locations) const noexcept;
std::optional<HostLoc> ValueLocation(const IR::Inst* value) const noexcept;
HostLoc UseImpl(BlockOfCode& code, IR::Value use_value, const boost::container::static_vector<HostLoc, 28>& desired_locations) noexcept;
HostLoc UseScratchImpl(BlockOfCode& code, IR::Value use_value, const boost::container::static_vector<HostLoc, 28>& desired_locations) noexcept;
HostLoc ScratchImpl(BlockOfCode& code, const boost::container::static_vector<HostLoc, 28>& desired_locations) noexcept;
void DefineValueImpl(BlockOfCode& code, IR::Inst* def_inst, HostLoc host_loc) noexcept;
void DefineValueImpl(BlockOfCode& code, IR::Inst* def_inst, const IR::Value& use_inst) noexcept;
inline std::optional<HostLoc> ValueLocation(const IR::Inst* value) const noexcept {
for (size_t i = 0; i < hostloc_info.size(); i++) {
if (hostloc_info[i].ContainsValue(value)) {
return HostLoc(i);
}
}
return std::nullopt;
}
HostLoc LoadImmediate(BlockOfCode& code, IR::Value imm, HostLoc host_loc) noexcept;
void Move(BlockOfCode& code, HostLoc to, HostLoc from) noexcept;
void CopyToScratch(BlockOfCode& code, size_t bit_width, HostLoc to, HostLoc from) noexcept;
void Exchange(BlockOfCode& code, HostLoc a, HostLoc b) noexcept;
void MoveOutOfTheWay(BlockOfCode& code, HostLoc reg) noexcept;
HostLoc UseImpl(IR::Value use_value, const boost::container::static_vector<HostLoc, 28>& desired_locations) noexcept;
HostLoc UseScratchImpl(IR::Value use_value, const boost::container::static_vector<HostLoc, 28>& desired_locations) noexcept;
HostLoc ScratchImpl(const boost::container::static_vector<HostLoc, 28>& desired_locations) noexcept;
void DefineValueImpl(IR::Inst* def_inst, HostLoc host_loc) noexcept;
void DefineValueImpl(IR::Inst* def_inst, const IR::Value& use_inst) noexcept;
void SpillRegister(BlockOfCode& code, HostLoc loc) noexcept;
HostLoc LoadImmediate(IR::Value imm, HostLoc host_loc) noexcept;
void Move(HostLoc to, HostLoc from) noexcept;
void CopyToScratch(size_t bit_width, HostLoc to, HostLoc from) noexcept;
void Exchange(HostLoc a, HostLoc b) noexcept;
void MoveOutOfTheWay(HostLoc reg) noexcept;
void SpillRegister(HostLoc loc) noexcept;
HostLoc FindFreeSpill(bool is_xmm) const noexcept;
inline HostLocInfo& LocInfo(const HostLoc loc) noexcept {
ASSERT(loc != HostLoc::RSP && loc != ABI_JIT_PTR);
return hostloc_info[size_t(loc)];
return hostloc_info[static_cast<size_t>(loc)];
}
inline const HostLocInfo& LocInfo(const HostLoc loc) const noexcept {
ASSERT(loc != HostLoc::RSP && loc != ABI_JIT_PTR);
return hostloc_info[size_t(loc)];
return hostloc_info[static_cast<size_t>(loc)];
}
void EmitMove(BlockOfCode& code, const size_t bit_width, const HostLoc to, const HostLoc from) noexcept;
void EmitExchange(BlockOfCode& code, const HostLoc a, const HostLoc b) noexcept;
void EmitMove(const size_t bit_width, const HostLoc to, const HostLoc from) noexcept;
void EmitExchange(const HostLoc a, const HostLoc b) noexcept;
// data
alignas(64) boost::container::static_vector<HostLoc, 28> gpr_order;
alignas(64) boost::container::static_vector<HostLoc, 28> xmm_order;
alignas(64) std::array<HostLocInfo, NonSpillHostLocCount + SpillCount> hostloc_info;
BlockOfCode* code = nullptr;
size_t reserved_stack_space = 0;
};
// Ensure a cache line (or less) is used; this is essential


@@ -100,14 +100,9 @@ bool Value::IsEmpty() const noexcept {
}
bool Value::IsImmediate() const noexcept {
IR::Type current_type = type;
IR::Inst const* current_inst = inner.inst;
while (current_type == Type::Opaque && current_inst->GetOpcode() == Opcode::Identity) {
Value const& arg = current_inst->GetArg(0);
current_type = arg.type;
current_inst = arg.inner.inst;
}
return current_type != Type::Opaque;
if (IsIdentity())
return inner.inst->GetArg(0).IsImmediate();
return type != Type::Opaque;
}
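The replacement above swaps an explicit identity-chasing loop for a recursive check: an Opaque value wrapping an Identity instruction defers to the wrapped argument. A toy illustration of that rule (the types here are stand-ins, not dynarmic's actual IR classes):

```cpp
#include <cassert>

// Toy model: an Opaque value that wraps an Identity instruction defers the
// IsImmediate question to the wrapped argument; otherwise, any non-Opaque
// type counts as an immediate.
struct ToyValue {
    bool is_opaque;
    const ToyValue* identity_arg; // non-null when this wraps an Identity inst

    bool IsImmediate() const {
        if (is_opaque && identity_arg != nullptr) {
            return identity_arg->IsImmediate(); // chase through Identity
        }
        return !is_opaque;
    }
};
```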
Type Value::GetType() const noexcept {


@@ -308,6 +308,16 @@ void Config::ReadDebuggingValues() {
EndGroup();
}
#ifdef __unix__
void Config::ReadLinuxValues() {
BeginGroup(Settings::TranslateCategory(Settings::Category::Linux));
ReadCategory(Settings::Category::Linux);
EndGroup();
}
#endif
void Config::ReadServiceValues() {
BeginGroup(Settings::TranslateCategory(Settings::Category::Services));
@@ -424,6 +434,9 @@ void Config::ReadValues() {
ReadControlValues();
ReadCoreValues();
ReadCpuValues();
#ifdef __unix__
ReadLinuxValues();
#endif
ReadRendererValues();
ReadAudioValues();
ReadSystemValues();
@@ -524,6 +537,9 @@ void Config::SaveValues() {
SaveControlValues();
SaveCoreValues();
SaveCpuValues();
#ifdef __unix__
SaveLinuxValues();
#endif
SaveRendererValues();
SaveAudioValues();
SaveSystemValues();
@@ -600,6 +616,16 @@ void Config::SaveDebuggingValues() {
EndGroup();
}
#ifdef __unix__
void Config::SaveLinuxValues() {
BeginGroup(Settings::TranslateCategory(Settings::Category::Linux));
WriteCategory(Settings::Category::Linux);
EndGroup();
}
#endif
void Config::SaveNetworkValues() {
BeginGroup(Settings::TranslateCategory(Settings::Category::Services));


@@ -84,6 +84,9 @@ protected:
void ReadCoreValues();
void ReadDataStorageValues();
void ReadDebuggingValues();
#ifdef __unix__
void ReadLinuxValues();
#endif
void ReadServiceValues();
void ReadDisabledAddOnValues();
void ReadMiscellaneousValues();
@@ -116,6 +119,9 @@ protected:
void SaveCoreValues();
void SaveDataStorageValues();
void SaveDebuggingValues();
#ifdef __unix__
void SaveLinuxValues();
#endif
void SaveNetworkValues();
void SaveDisabledAddOnValues();
void SaveMiscellaneousValues();


@@ -49,13 +49,16 @@ u64 ClearDir(DataDir dir, const std::string &user_id)
return result;
}
std::string ReadableBytesSize(u64 size) noexcept {
std::array<std::string_view, 6> const units{"B", "KB", "MB", "GB", "TB", "PB"};
u64 const base = 1000;
if (size == 0)
const std::string ReadableBytesSize(u64 size)
{
static constexpr std::array units{"B", "KiB", "MiB", "GiB", "TiB", "PiB"};
if (size == 0) {
return "0 B";
auto const digit_groups = std::min<u64>(u64(std::log10(size) / std::log10(base)), u64(units.size()));
return fmt::format("{:.1f} {}", size / std::pow(base, digit_groups), units[digit_groups]);
}
const int digit_groups = (std::min) (static_cast<int>(std::log10(size) / std::log10(1024)),
static_cast<int>(units.size()));
return fmt::format("{:.1f} {}", size / std::pow(1024, digit_groups), units[digit_groups]);
}
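The new version above switches from base-1000 units to base-1024 IEC units (KiB, MiB, ...). A minimal standalone sketch of the same formatting logic, with a hypothetical helper name and the index clamped to `units.size() - 1` so it can never run past the last unit:

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <string>

// Sketch of the binary-unit byte formatter; name and clamp are illustrative.
std::string ReadableBytesSizeSketch(std::uint64_t size) {
    static constexpr std::array<const char*, 6> units{"B", "KiB", "MiB", "GiB", "TiB", "PiB"};
    if (size == 0) {
        return "0 B";
    }
    // Number of times size can be divided by 1024, clamped to the last unit.
    const int digit_groups = std::min(static_cast<int>(std::log10(size) / std::log10(1024.0)),
                                      static_cast<int>(units.size()) - 1);
    char buf[32];
    std::snprintf(buf, sizeof(buf), "%.1f %s", size / std::pow(1024.0, digit_groups),
                  units[digit_groups]);
    return buf;
}
```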
u64 DataDirSize(DataDir dir)


@@ -16,7 +16,8 @@ const std::filesystem::path GetDataDir(DataDir dir, const std::string &user_id =
const std::string GetDataDirString(DataDir dir, const std::string &user_id = "");
u64 ClearDir(DataDir dir, const std::string &user_id = "");
std::string ReadableBytesSize(u64 size) noexcept;
const std::string ReadableBytesSize(u64 size);
u64 DataDirSize(DataDir dir);


@@ -5,9 +5,6 @@ add_library(qt_common STATIC
qt_common.h
qt_common.cpp
gamemode.cpp
gamemode.h
config/uisettings.cpp
config/uisettings.h
config/qt_config.cpp
@@ -85,7 +82,6 @@ find_package(frozen)
target_link_libraries(qt_common PRIVATE core Qt6::Core Qt6::Concurrent SimpleIni::SimpleIni QuaZip::QuaZip)
target_link_libraries(qt_common PUBLIC frozen::frozen-headers)
target_link_libraries(qt_common PRIVATE gamemode::headers)
if (NOT APPLE AND ENABLE_OPENGL)
target_compile_definitions(qt_common PUBLIC HAS_OPENGL)


@@ -277,8 +277,9 @@ std::unique_ptr<TranslationMap> InitializeTranslations(QObject* parent)
INSERT(Settings,
gpu_accuracy,
tr("GPU Accuracy:"),
tr("Controls the GPU emulation accuracy.\nMost games render fine with Fast or Balanced modes, but Accurate is still "
"required for some.\nParticles tend to only render correctly with Accurate mode."));
tr("Controls the GPU emulation accuracy.\nMost games render fine with Normal, but High is still "
"required for some.\nParticles tend to only render correctly with High "
"accuracy.\nExtreme should only be used as a last resort."));
INSERT(Settings,
dma_accuracy,
tr("DMA Accuracy:"),
@@ -432,10 +433,10 @@ std::unique_ptr<TranslationMap> InitializeTranslations(QObject* parent)
tr("Whether or not to check for updates upon startup."));
// Linux
INSERT(UISettings, enable_gamemode, tr("Enable Gamemode"), QString());
INSERT(Settings, enable_gamemode, tr("Enable Gamemode"), QString());
#ifdef __unix__
INSERT(UISettings, gui_force_x11, tr("Force X11 as Graphics Backend"), QString());
INSERT(UISettings, gui_hide_backend_warning, QString(), QString());
INSERT(Settings, gui_force_x11, tr("Force X11 as Graphics Backend"), QString());
INSERT(Settings, gui_hide_backend_warning, QString(), QString());
#endif
// Ui Debugging
@@ -506,9 +507,9 @@ std::unique_ptr<ComboboxTranslationMap> ComboboxEnumeration(QObject* parent)
}});
translations->insert({Settings::EnumMetadata<Settings::GpuAccuracy>::Index(),
{
PAIR(GpuAccuracy, Low, tr("Fast")),
PAIR(GpuAccuracy, Medium, tr("Balanced")),
PAIR(GpuAccuracy, High, tr("Accurate")),
PAIR(GpuAccuracy, Normal, tr("Normal")),
PAIR(GpuAccuracy, High, tr("High")),
PAIR(GpuAccuracy, Extreme, tr("Extreme")),
}});
translations->insert({Settings::EnumMetadata<Settings::DmaAccuracy>::Index(),
{


@@ -61,9 +61,9 @@ static const std::map<Settings::ConsoleMode, QString> use_docked_mode_texts_map
};
static const std::map<Settings::GpuAccuracy, QString> gpu_accuracy_texts_map = {
{Settings::GpuAccuracy::Low, QStringLiteral(QT_TRANSLATE_NOOP("MainWindow", "Fast"))},
{Settings::GpuAccuracy::Medium, QStringLiteral(QT_TRANSLATE_NOOP("MainWindow", "Balanced"))},
{Settings::GpuAccuracy::High, QStringLiteral(QT_TRANSLATE_NOOP("MainWindow", "Accurate"))},
{Settings::GpuAccuracy::Normal, QStringLiteral(QT_TRANSLATE_NOOP("MainWindow", "Normal"))},
{Settings::GpuAccuracy::High, QStringLiteral(QT_TRANSLATE_NOOP("MainWindow", "High"))},
{Settings::GpuAccuracy::Extreme, QStringLiteral(QT_TRANSLATE_NOOP("MainWindow", "Extreme"))},
};
static const std::map<Settings::RendererBackend, QString> renderer_backend_texts_map = {


@@ -142,19 +142,6 @@ struct Values {
Setting<bool> check_for_updates{linkage, true, "check_for_updates", Category::UiGeneral};
// Linux/MinGW may support (requires libdl support)
SwitchableSetting<bool> enable_gamemode{linkage,
#ifndef _MSC_VER
true,
#else
false,
#endif
"enable_gamemode", Category::UiGeneral};
#ifdef __unix__
SwitchableSetting<bool> gui_force_x11{linkage, false, "gui_force_x11", Category::UiGeneral};
Setting<bool> gui_hide_backend_warning{linkage, false, "gui_hide_backend_warning", Category::UiGeneral};
#endif
// Discord RPC
Setting<bool> enable_discord_presence{linkage, false, "enable_discord_presence", Category::Ui};


@@ -1,52 +0,0 @@
// SPDX-FileCopyrightText: Copyright 2025 Eden Emulator Project
// SPDX-License-Identifier: GPL-3.0-or-later
// SPDX-FileCopyrightText: Copyright 2023 yuzu Emulator Project
// SPDX-License-Identifier: GPL-2.0-or-later
// While technically available on all *NIX platforms, Linux is the primary
// target of libgamemode.so - so warnings are suppressed elsewhere
#ifdef __unix__
#include <gamemode_client.h>
#endif
#include "qt_common/gamemode.h"
#include "common/logging/log.h"
#include "qt_common/config/uisettings.h"
namespace Common::FeralGamemode {
/// @brief Start the gamemode client
void Start() noexcept {
if (UISettings::values.enable_gamemode) {
#ifdef __unix__
if (gamemode_request_start() < 0) {
#ifdef __linux__
LOG_WARNING(Frontend, "{}", gamemode_error_string());
#else
LOG_INFO(Frontend, "{}", gamemode_error_string());
#endif
} else {
LOG_INFO(Frontend, "Done");
}
#endif
}
}
/// @brief Stop the gamemode client
void Stop() noexcept {
if (UISettings::values.enable_gamemode) {
#ifdef __unix__
if (gamemode_request_end() < 0) {
#ifdef __linux__
LOG_WARNING(Frontend, "{}", gamemode_error_string());
#else
LOG_INFO(Frontend, "{}", gamemode_error_string());
#endif
} else {
LOG_INFO(Frontend, "Done");
}
#endif
}
}
} // namespace Common::FeralGamemode


@@ -1,14 +0,0 @@
// SPDX-FileCopyrightText: Copyright 2025 Eden Emulator Project
// SPDX-License-Identifier: GPL-3.0-or-later
// SPDX-FileCopyrightText: Copyright 2023 yuzu Emulator Project
// SPDX-License-Identifier: GPL-2.0-or-later
#pragma once
namespace Common::FeralGamemode {
void Start() noexcept;
void Stop() noexcept;
} // namespace Common::FeralGamemode


@@ -103,7 +103,7 @@ bool DmaPusher::Step() {
ProcessCommands(headers);
};
const bool use_safe = Settings::IsDMALevelDefault() ? (Settings::IsGPULevelMedium() || Settings::IsGPULevelHigh()) : Settings::IsDMALevelSafe();
const bool use_safe = Settings::IsDMALevelDefault() ? Settings::IsGPULevelHigh() : Settings::IsDMALevelSafe();
if (use_safe) {
safe_process();


@@ -72,33 +72,66 @@ public:
}
void SignalFence(std::function<void()>&& func) {
if constexpr (!can_async_check) {
TryReleasePendingFences<false>();
}
const bool delay_fence = Settings::IsGPULevelHigh();
#ifdef __ANDROID__
const bool use_optimized = Settings::values.early_release_fences.GetValue();
#else
constexpr bool use_optimized = false;
#endif
const bool should_flush = ShouldFlush();
CommitAsyncFlushes();
TFence new_fence = CreateFence(!should_flush);
if constexpr (can_async_check) {
guard.lock();
}
if (Settings::IsGPULevelLow() || (Settings::IsGPULevelMedium() && !should_flush)) {
func();
if (use_optimized) {
if (!delay_fence) {
TryReleasePendingFences<false>();
}
if (delay_fence) {
guard.lock();
uncommitted_operations.emplace_back(std::move(func));
}
} else {
uncommitted_operations.emplace_back(std::move(func));
}
if (!uncommitted_operations.empty()) {
pending_operations.emplace_back(std::move(uncommitted_operations));
uncommitted_operations.clear();
if constexpr (!can_async_check) {
TryReleasePendingFences<false>();
}
if constexpr (can_async_check) {
guard.lock();
}
if (delay_fence) {
uncommitted_operations.emplace_back(std::move(func));
}
}
pending_operations.emplace_back(std::move(uncommitted_operations));
QueueFence(new_fence);
if (!delay_fence) {
func();
}
fences.push(std::move(new_fence));
if (should_flush) {
rasterizer.FlushCommands();
}
if constexpr (can_async_check) {
guard.unlock();
cv.notify_all();
if (use_optimized) {
if (delay_fence) {
guard.unlock();
cv.notify_all();
}
} else {
if constexpr (can_async_check) {
guard.unlock();
cv.notify_all();
}
}
rasterizer.InvalidateGPUCache();
}
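The restructured `SignalFence` above boils down to one decision: when the fence is delayed, the callback is queued behind the fence; otherwise it runs immediately after the fence is queued. A toy model of that flow, stripped of locking and flushing (names are illustrative, not the emulator's API):

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <vector>

// Toy model: delayed fences accumulate callbacks that fire when the fence is
// released; non-delayed fences run the callback right away.
struct ToyFenceManager {
    std::vector<std::function<void()>> uncommitted_operations;
    std::deque<std::vector<std::function<void()>>> pending_operations;
    int queued_fences = 0;

    void SignalFence(std::function<void()>&& func, bool delay_fence) {
        if (delay_fence) {
            uncommitted_operations.emplace_back(std::move(func));
        }
        pending_operations.emplace_back(std::move(uncommitted_operations));
        uncommitted_operations.clear();
        ++queued_fences; // stands in for QueueFence(new_fence)
        if (!delay_fence) {
            func(); // run immediately when the fence is not delayed
        }
    }
};
```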


@@ -79,8 +79,15 @@ void ThreadManager::FlushRegion(DAddr addr, u64 size) {
if (!is_async) {
// Always flush with synchronous GPU mode
PushCommand(FlushRegionCommand(addr, size));
return;
}
return;
if (!Settings::IsGPULevelExtreme()) {
return;
}
auto& gpu = system.GPU();
u64 fence = gpu.RequestFlush(addr, size);
TickGPU();
gpu.WaitForSyncOperation(fence);
}
void ThreadManager::TickGPU() {


@@ -629,6 +629,9 @@ void RasterizerOpenGL::ReleaseFences(bool force) {
void RasterizerOpenGL::FlushAndInvalidateRegion(DAddr addr, u64 size,
VideoCommon::CacheType which) {
if (Settings::IsGPULevelExtreme()) {
FlushRegion(addr, size, which);
}
InvalidateRegion(addr, size, which);
}


@@ -180,7 +180,145 @@ RendererVulkan::~RendererVulkan() {
void(device.GetLogical().WaitIdle());
}
#ifdef __ANDROID__
class BooleanSetting {
public:
// static BooleanSetting FRAME_SKIPPING;
static BooleanSetting FRAME_INTERPOLATION;
explicit BooleanSetting(bool initial_value = false) : value(initial_value) {}
[[nodiscard]] bool getBoolean() const {
return value;
}
void setBoolean(bool new_value) {
value = new_value;
}
private:
bool value;
};
// Initialize static members
// BooleanSetting BooleanSetting::FRAME_SKIPPING(false);
BooleanSetting BooleanSetting::FRAME_INTERPOLATION(false);
// extern "C" JNIEXPORT jboolean JNICALL
// Java_org_yuzu_yuzu_1emu_features_settings_model_BooleanSetting_isFrameSkippingEnabled(JNIEnv* env, jobject /* this */) {
// return static_cast<jboolean>(BooleanSetting::FRAME_SKIPPING.getBoolean());
// }
extern "C" JNIEXPORT jboolean JNICALL
Java_org_yuzu_yuzu_1emu_features_settings_model_BooleanSetting_isFrameInterpolationEnabled(JNIEnv* env, jobject /* this */) {
return static_cast<jboolean>(BooleanSetting::FRAME_INTERPOLATION.getBoolean());
}
void RendererVulkan::InterpolateFrames(Frame* prev_frame, Frame* interpolated_frame) {
if (!prev_frame || !interpolated_frame || !prev_frame->image || !interpolated_frame->image) {
return;
}
const auto& framebuffer_layout = render_window.GetFramebufferLayout();
// Fixed aggressive downscale (50%)
VkExtent2D dst_extent{
.width = framebuffer_layout.width / 2,
.height = framebuffer_layout.height / 2
};
// Check if we need to recreate the destination frame
bool needs_recreation = false; // Only recreate when necessary
if (!interpolated_frame->image_view) {
needs_recreation = true; // Need to create initially
} else {
// Check if dimensions have changed
if (interpolated_frame->framebuffer) {
needs_recreation = (framebuffer_layout.width / 2 != dst_extent.width) ||
(framebuffer_layout.height / 2 != dst_extent.height);
} else {
needs_recreation = true;
}
}
if (needs_recreation) {
interpolated_frame->image = CreateWrappedImage(memory_allocator, dst_extent, swapchain.GetImageViewFormat());
interpolated_frame->image_view = CreateWrappedImageView(device, interpolated_frame->image, swapchain.GetImageViewFormat());
interpolated_frame->framebuffer = blit_swapchain.CreateFramebuffer(
Layout::FramebufferLayout{dst_extent.width, dst_extent.height},
*interpolated_frame->image_view,
swapchain.GetImageViewFormat());
}
scheduler.RequestOutsideRenderPassOperationContext();
scheduler.Record([&](vk::CommandBuffer cmdbuf) {
// Transition images to transfer layouts
TransitionImageLayout(cmdbuf, *prev_frame->image, VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL);
TransitionImageLayout(cmdbuf, *interpolated_frame->image, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL);
// Perform the downscale blit
VkImageBlit blit_region{};
blit_region.srcSubresource = {VK_IMAGE_ASPECT_COLOR_BIT, 0, 0, 1};
blit_region.srcOffsets[0] = {0, 0, 0};
blit_region.srcOffsets[1] = {
static_cast<int32_t>(framebuffer_layout.width),
static_cast<int32_t>(framebuffer_layout.height),
1
};
blit_region.dstSubresource = {VK_IMAGE_ASPECT_COLOR_BIT, 0, 0, 1};
blit_region.dstOffsets[0] = {0, 0, 0};
blit_region.dstOffsets[1] = {
static_cast<int32_t>(dst_extent.width),
static_cast<int32_t>(dst_extent.height),
1
};
// Using the wrapper's BlitImage with proper parameters
cmdbuf.BlitImage(
*prev_frame->image, VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL,
*interpolated_frame->image, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL,
blit_region, VK_FILTER_NEAREST
);
// Transition back to general layout
TransitionImageLayout(cmdbuf, *prev_frame->image, VK_IMAGE_LAYOUT_GENERAL);
TransitionImageLayout(cmdbuf, *interpolated_frame->image, VK_IMAGE_LAYOUT_GENERAL);
});
}
#endif
void RendererVulkan::Composite(std::span<const Tegra::FramebufferConfig> framebuffers) {
#ifdef __ANDROID__
static int frame_counter = 0;
static int target_fps = 60; // Target FPS (30 or 60)
int frame_skip_threshold = 1;
bool frame_skipping = false; //BooleanSetting::FRAME_SKIPPING.getBoolean();
bool frame_interpolation = BooleanSetting::FRAME_INTERPOLATION.getBoolean();
#endif
if (framebuffers.empty()) {
return;
}
#ifdef __ANDROID__
if (frame_skipping) {
frame_skip_threshold = (target_fps == 30) ? 2 : 2;
}
frame_counter++;
if (frame_counter % frame_skip_threshold != 0) {
if (frame_interpolation && previous_frame) {
Frame* interpolated_frame = present_manager.GetRenderFrame();
InterpolateFrames(previous_frame, interpolated_frame);
blit_swapchain.DrawToFrame(rasterizer, interpolated_frame, framebuffers,
render_window.GetFramebufferLayout(), swapchain.GetImageCount(),
swapchain.GetImageViewFormat());
scheduler.Flush(*interpolated_frame->render_ready);
present_manager.Present(interpolated_frame);
}
return;
}
#endif
SCOPE_EXIT {
render_window.OnFrameDisplayed();
};


@@ -80,9 +80,7 @@ void MasterSemaphore::Wait(u64 tick) {
if (!semaphore) {
// If we don't support timeline semaphores, wait for the value normally
std::unique_lock lk{free_mutex};
free_cv.wait(lk, [&] {
return gpu_tick.load(std::memory_order_acquire) >= tick;
});
free_cv.wait(lk, [&] { return gpu_tick.load(std::memory_order_relaxed) >= tick; });
return;
}
@@ -218,32 +216,15 @@ void MasterSemaphore::WaitThread(std::stop_token token) {
wait_queue.pop();
}
#ifdef ANDROID
VkResult status;
do {
status = fence.GetStatus();
if (status == VK_NOT_READY) {
std::this_thread::sleep_for(std::chrono::microseconds(100));
}
} while (status == VK_NOT_READY);
if (status == VK_SUCCESS) {
fence.Reset();
} else {
vk::Check(status);
continue;
}
#else
fence.Wait();
fence.Reset();
#endif
{
std::scoped_lock lock{free_mutex};
free_queue.push_front(std::move(fence));
gpu_tick.store(host_tick, std::memory_order_release);
gpu_tick.store(host_tick);
}
free_cv.notify_all();
free_cv.notify_one();
}
}
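The hunk above drops the Android-only fence-polling loop and falls back to a plain condition-variable wait on the GPU tick. A self-contained sketch of that wait/signal pattern (a toy, not the emulator's MasterSemaphore):

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>

// Toy tick waiter: a waiter blocks on a condition variable until the
// observed GPU tick reaches the requested value; the signaler stores the
// new tick under the lock and notifies.
struct ToyTickWaiter {
    std::mutex m;
    std::condition_variable cv;
    std::atomic<std::uint64_t> gpu_tick{0};

    void Wait(std::uint64_t tick) {
        std::unique_lock lk{m};
        cv.wait(lk, [&] { return gpu_tick.load(std::memory_order_relaxed) >= tick; });
    }

    void Signal(std::uint64_t tick) {
        {
            std::scoped_lock lk{m};
            gpu_tick.store(tick);
        }
        cv.notify_all();
    }
};
```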


@@ -1415,10 +1415,14 @@ bool QueryCacheRuntime::HostConditionalRenderingCompareValues(VideoCommon::Looku
return false;
}
auto driver_id = impl->device.GetDriverID();
const bool is_gpu_high = Settings::IsGPULevelHigh();
if (!is_gpu_high && impl->device.GetDriverID() == VK_DRIVER_ID_INTEL_PROPRIETARY_WINDOWS) {
return true;
}
if ((!is_gpu_high && driver_id == VK_DRIVER_ID_INTEL_PROPRIETARY_WINDOWS) || driver_id == VK_DRIVER_ID_QUALCOMM_PROPRIETARY || driver_id == VK_DRIVER_ID_ARM_PROPRIETARY || driver_id == VK_DRIVER_ID_MESA_TURNIP) {
auto driver_id = impl->device.GetDriverID();
if (driver_id == VK_DRIVER_ID_QUALCOMM_PROPRIETARY ||
driver_id == VK_DRIVER_ID_ARM_PROPRIETARY || driver_id == VK_DRIVER_ID_MESA_TURNIP) {
return true;
}
@@ -1439,6 +1443,7 @@ bool QueryCacheRuntime::HostConditionalRenderingCompareValues(VideoCommon::Looku
}
if (!is_in_bc[0] && !is_in_bc[1]) {
// Both queries are in query cache, it's best to just flush.
return true;
}
HostConditionalRenderingCompareBCImpl(object_1.address, equal_check);


@@ -730,6 +730,9 @@ void RasterizerVulkan::ReleaseFences(bool force) {
void RasterizerVulkan::FlushAndInvalidateRegion(DAddr addr, u64 size,
VideoCommon::CacheType which) {
if (Settings::IsGPULevelExtreme()) {
FlushRegion(addr, size, which);
}
InvalidateRegion(addr, size, which);
}


@@ -1269,17 +1269,6 @@ void TextureCacheRuntime::ConvertImage(Framebuffer* dst, ImageView& dst_view, Im
case PixelFormat::R32G32_FLOAT:
case PixelFormat::R32G32_SINT:
case PixelFormat::R32_FLOAT:
if (src_view.format == PixelFormat::D32_FLOAT) {
const Region2D region{
.start = {0, 0},
.end = {static_cast<s32>(dst->RenderArea().width),
static_cast<s32>(dst->RenderArea().height)},
};
return blit_image_helper.BlitColor(dst, src_view, region, region,
Tegra::Engines::Fermi2D::Filter::Point,
Tegra::Engines::Fermi2D::Operation::SrcCopy);
}
break;
case PixelFormat::R16_FLOAT:
case PixelFormat::R16_UNORM:
case PixelFormat::R16_SNORM:


@@ -663,7 +663,11 @@ Device::Device(VkInstance instance_, vk::PhysicalDevice physical_, VkSurfaceKHR
break;
}
if (!Settings::values.vertex_input_dynamic_state.GetValue() || !extensions.extended_dynamic_state) {
if (!extensions.extended_dynamic_state) {
Settings::values.vertex_input_dynamic_state.SetValue(false);
}
if (!Settings::values.vertex_input_dynamic_state.GetValue()) {
RemoveExtensionFeature(extensions.vertex_input_dynamic_state, features.vertex_input_dynamic_state, VK_EXT_VERTEX_INPUT_DYNAMIC_STATE_EXTENSION_NAME);
}
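The split above first force-disables the user toggle when the device lacks `VK_EXT_extended_dynamic_state`, then removes the dependent extension feature if the toggle ends up off. The gating logic can be sketched as (hypothetical names, not yuzu's `Device` API):

```cpp
#include <cassert>

// Minimal sketch: a user-facing toggle is force-disabled when the device
// lacks the extension it depends on; otherwise the user's choice stands.
struct ToyDevice {
    bool has_extended_dynamic_state;
};

bool ResolveVertexInputDynamicState(bool user_enabled, const ToyDevice& dev) {
    if (!dev.has_extended_dynamic_state) {
        return false; // the extension is a hard requirement
    }
    return user_enabled;
}
```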


@@ -104,6 +104,9 @@ add_executable(yuzu
configuration/configure_input_profile_dialog.cpp
configuration/configure_input_profile_dialog.h
configuration/configure_input_profile_dialog.ui
configuration/configure_linux_tab.cpp
configuration/configure_linux_tab.h
configuration/configure_linux_tab.ui
configuration/configure_mouse_panning.cpp
configuration/configure_mouse_panning.h
configuration/configure_mouse_panning.ui


@@ -40,6 +40,8 @@ void ConfigureGeneral::SetConfiguration() {}
void ConfigureGeneral::Setup(const ConfigurationShared::Builder& builder) {
QLayout& general_layout = *ui->general_widget->layout();
QLayout& linux_layout = *ui->linux_widget->layout();
std::map<u32, QWidget*> general_hold{};
std::map<u32, QWidget*> linux_hold{};
@@ -52,6 +54,13 @@ void ConfigureGeneral::Setup(const ConfigurationShared::Builder& builder) {
};
push(UISettings::values.linkage.by_category[Settings::Category::UiGeneral]);
push(Settings::values.linkage.by_category[Settings::Category::Linux]);
// Only show Linux group on Unix
#ifndef __unix__
ui->LinuxGroupBox->setVisible(false);
#endif
for (const auto setting : settings) {
auto* widget = builder.BuildWidget(setting, apply_funcs);
@@ -67,6 +76,9 @@ void ConfigureGeneral::Setup(const ConfigurationShared::Builder& builder) {
case Settings::Category::UiGeneral:
general_hold.emplace(setting->Id(), widget);
break;
case Settings::Category::Linux:
linux_hold.emplace(setting->Id(), widget);
break;
default:
widget->deleteLater();
}
@@ -75,6 +87,9 @@ void ConfigureGeneral::Setup(const ConfigurationShared::Builder& builder) {
for (const auto& [id, widget] : general_hold) {
general_layout.addWidget(widget);
}
for (const auto& [id, widget] : linux_hold) {
linux_layout.addWidget(widget);
}
}
// Called to set the callback when resetting settings to defaults


@@ -46,6 +46,33 @@
</layout>
</widget>
</item>
<item>
<widget class="QGroupBox" name="LinuxGroupBox">
<property name="title">
<string>Linux</string>
</property>
<layout class="QVBoxLayout" name="LinuxVerticalLayout_1">
<item>
<widget class="QWidget" name="linux_widget" native="true">
<layout class="QVBoxLayout" name="LinuxVerticalLayout_2">
<property name="leftMargin">
<number>0</number>
</property>
<property name="topMargin">
<number>0</number>
</property>
<property name="rightMargin">
<number>0</number>
</property>
<property name="bottomMargin">
<number>0</number>
</property>
</layout>
</widget>
</item>
</layout>
</widget>
</item>
<item>
<spacer name="verticalSpacer">
<property name="orientation">


@@ -0,0 +1,75 @@
// SPDX-FileCopyrightText: Copyright 2019 yuzu Emulator Project
// SPDX-License-Identifier: GPL-2.0-or-later
#include "common/settings.h"
#include "core/core.h"
#include "ui_configure_linux_tab.h"
#include "yuzu/configuration/configuration_shared.h"
#include "yuzu/configuration/configure_linux_tab.h"
#include "yuzu/configuration/shared_widget.h"
ConfigureLinuxTab::ConfigureLinuxTab(const Core::System& system_,
std::shared_ptr<std::vector<ConfigurationShared::Tab*>> group_,
const ConfigurationShared::Builder& builder, QWidget* parent)
: Tab(group_, parent), ui(std::make_unique<Ui::ConfigureLinuxTab>()), system{system_} {
ui->setupUi(this);
Setup(builder);
SetConfiguration();
}
ConfigureLinuxTab::~ConfigureLinuxTab() = default;
void ConfigureLinuxTab::SetConfiguration() {}
void ConfigureLinuxTab::Setup(const ConfigurationShared::Builder& builder) {
QLayout& linux_layout = *ui->linux_widget->layout();
std::map<u32, QWidget*> linux_hold{};
std::vector<Settings::BasicSetting*> settings;
const auto push = [&](Settings::Category category) {
for (const auto setting : Settings::values.linkage.by_category[category]) {
settings.push_back(setting);
}
};
push(Settings::Category::Linux);
for (auto* setting : settings) {
auto* widget = builder.BuildWidget(setting, apply_funcs);
if (widget == nullptr) {
continue;
}
if (!widget->Valid()) {
widget->deleteLater();
continue;
}
linux_hold.insert({setting->Id(), widget});
}
for (const auto& [id, widget] : linux_hold) {
linux_layout.addWidget(widget);
}
}
void ConfigureLinuxTab::ApplyConfiguration() {
const bool is_powered_on = system.IsPoweredOn();
for (const auto& apply_func : apply_funcs) {
apply_func(is_powered_on);
}
}
void ConfigureLinuxTab::changeEvent(QEvent* event) {
if (event->type() == QEvent::LanguageChange) {
RetranslateUI();
}
QWidget::changeEvent(event);
}
void ConfigureLinuxTab::RetranslateUI() {
ui->retranslateUi(this);
}


@@ -0,0 +1,44 @@
// SPDX-FileCopyrightText: Copyright 2023 yuzu Emulator Project
// SPDX-License-Identifier: GPL-2.0-or-later
#pragma once
#include <QWidget>
namespace Core {
class System;
}
namespace Ui {
class ConfigureLinuxTab;
}
namespace ConfigurationShared {
class Builder;
}
class ConfigureLinuxTab : public ConfigurationShared::Tab {
Q_OBJECT
public:
explicit ConfigureLinuxTab(const Core::System& system_,
std::shared_ptr<std::vector<ConfigurationShared::Tab*>> group,
const ConfigurationShared::Builder& builder,
QWidget* parent = nullptr);
~ConfigureLinuxTab() override;
void ApplyConfiguration() override;
void SetConfiguration() override;
private:
void changeEvent(QEvent* event) override;
void RetranslateUI();
void Setup(const ConfigurationShared::Builder& builder);
std::unique_ptr<Ui::ConfigureLinuxTab> ui;
const Core::System& system;
std::vector<std::function<void(bool)>> apply_funcs{};
};


@@ -0,0 +1,53 @@
<?xml version="1.0" encoding="UTF-8"?>
<ui version="4.0">
<class>ConfigureLinuxTab</class>
<widget class="QWidget" name="ConfigureLinuxTab">
<property name="accessibleName">
<string>Linux</string>
</property>
<layout class="QVBoxLayout">
<item>
<widget class="QGroupBox" name="LinuxGroupBox">
<property name="title">
<string>Linux</string>
</property>
<layout class="QVBoxLayout" name="LinuxVerticalLayout_1">
<item>
<widget class="QWidget" name="linux_widget" native="true">
<layout class="QVBoxLayout" name="LinuxVerticalLayout_2">
<property name="leftMargin">
<number>0</number>
</property>
<property name="topMargin">
<number>0</number>
</property>
<property name="rightMargin">
<number>0</number>
</property>
<property name="bottomMargin">
<number>0</number>
</property>
</layout>
</widget>
</item>
</layout>
</widget>
</item>
<item>
<spacer name="verticalSpacer">
<property name="orientation">
<enum>Qt::Vertical</enum>
</property>
<property name="sizeHint" stdset="0">
<size>
<width>20</width>
<height>40</height>
</size>
</property>
</spacer>
</item>
</layout>
</widget>
<resources/>
<connections/>
</ui>


@@ -37,6 +37,7 @@
#include "yuzu/configuration/configure_graphics_advanced.h"
#include "yuzu/configuration/configure_graphics_extensions.h"
#include "yuzu/configuration/configure_input_per_game.h"
#include "yuzu/configuration/configure_linux_tab.h"
#include "yuzu/configuration/configure_per_game.h"
#include "yuzu/configuration/configure_per_game_addons.h"
#include "yuzu/configuration/configure_system.h"
@@ -67,6 +68,7 @@ ConfigurePerGame::ConfigurePerGame(QWidget* parent, u64 title_id_, const std::st
system_, vk_device_records, [&]() { graphics_advanced_tab->ExposeComputeOption(); },
[](Settings::AspectRatio, Settings::ResolutionSetup) {}, tab_group, *builder, this);
input_tab = std::make_unique<ConfigureInputPerGame>(system_, game_config.get(), this);
linux_tab = std::make_unique<ConfigureLinuxTab>(system_, tab_group, *builder, this);
system_tab = std::make_unique<ConfigureSystem>(system_, tab_group, *builder, this);
network_tab = std::make_unique<ConfigureNetwork>(system_, this);
@@ -82,6 +84,13 @@ ConfigurePerGame::ConfigurePerGame(QWidget* parent, u64 title_id_, const std::st
ui->tabWidget->addTab(input_tab.get(), tr("Input Profiles"));
ui->tabWidget->addTab(network_tab.get(), tr("Network"));
// Only show Linux tab on Unix
linux_tab->setVisible(false);
#ifdef __unix__
linux_tab->setVisible(true);
ui->tabWidget->addTab(linux_tab.get(), tr("Linux"));
#endif
setFocusPolicy(Qt::ClickFocus);
setWindowTitle(tr("Properties"));


@@ -36,6 +36,7 @@ class ConfigureGraphics;
class ConfigureGraphicsAdvanced;
class ConfigureGraphicsExtensions;
class ConfigureInputPerGame;
class ConfigureLinuxTab;
class ConfigureSystem;
class ConfigureNetwork;
@@ -91,6 +92,7 @@ private:
std::unique_ptr<ConfigureGraphicsExtensions> graphics_extensions_tab;
std::unique_ptr<ConfigureGraphics> graphics_tab;
std::unique_ptr<ConfigureInputPerGame> input_tab;
std::unique_ptr<ConfigureLinuxTab> linux_tab;
std::unique_ptr<ConfigureSystem> system_tab;
std::unique_ptr<ConfigureNetwork> network_tab;
};

Some files were not shown because too many files have changed in this diff Show More