Compare commits

..

4 Commits

Author SHA1 Message Date
xbzk
ce4877b424 skip 1st 2nd dispatches extended to linux. vk_staging_buffer_pool.cpp clamp moved to buffer_cache.h 2025-12-26 05:08:12 +01:00
xbzk
6f4ade37e1 maxwell_dma: multisized components support added (untested) 2025-12-26 05:08:12 +01:00
xbzk
09f06a9a41 MCI boot fix (android): skipping problematic initial pair of dispatches 2025-12-26 05:08:12 +01:00
xbzk
a701ea274f Minimal stopgaps for MCI to boot
- Clamp staging buffer size to 2GB to prevent Vulkan allocation failures
- Add size validation in MappedUploadMemory to avoid buffer overruns
2025-12-26 05:08:12 +01:00
8 changed files with 51 additions and 92 deletions

View File

@@ -22,15 +22,6 @@ Or use Qt Creator (Create Project -> Import Project -> Git Clone).
Android has a completely different build process than other platforms. See its [dedicated page](build/Android.md).
## Cross-compile ARM64
A painless guide for cross compilation (or to test NCE) from a x86_64 system without polluting your main.
- Install QEMU: `sudo pkg install qemu`
- Download Debian 13: `wget https://cdimage.debian.org/debian-cd/current/arm64/iso-cd/debian-13.0.0-arm64-netinst.iso`
- Create a system disk: `qemu-img create -f qcow2 debian-13-arm64-ci.qcow2 30G`
- Run the VM: `qemu-system-aarch64 -M virt -m 2G -cpu max -bios /usr/local/share/qemu/edk2-aarch64-code.fd -drive if=none,file=debian-13.0.0-arm64-netinst.iso,format=raw,id=cdrom -device scsi-cd,drive=cdrom -drive if=none,file=debian-13-arm64-ci.qcow2,id=hd0,format=qcow2 -device virtio-blk-device,drive=hd0 -device virtio-gpu-pci -device usb-ehci -device usb-kbd -device intel-hda -device hda-output -nic user,model=virtio-net-pci`
## Initial Configuration
If the configure phase fails, see the `Troubleshooting` section below. Usually, as long as you followed the dependencies guide, the defaults *should* successfully configure and build.

View File

@@ -93,11 +93,6 @@ Eden is not currently available as a port on FreeBSD, though it is in the works.
The available OpenSSL port (3.0.17) is out-of-date, and using a bundled static library instead is recommended; to do so, add `-DYUZU_USE_BUNDLED_OPENSSL=ON` to your CMake configure command.
Gamepad/controllers may not work on 15.0, this is due to an outdated SDL not responding well to the new `usbhid(2)` driver. To workaround this simply disable `usbhid(2)` (add the following to `/boot/loader.conf`):
```sh
hw.usb.usbhid.enable="0"
```
## NetBSD
Install `pkgin` if not already `pkg_add pkgin`, see also the general [pkgsrc guide](https://www.netbsd.org/docs/pkgsrc/using.html). For NetBSD 10.1 provide `echo 'PKG_PATH="https://cdn.netbsd.org/pub/pkgsrc/packages/NetBSD/x86_64/10.0_2025Q3/All/"' >/etc/pkg_install.conf`. If `pkgin` is taking too much time consider adding the following to `/etc/rc.conf`:
@@ -201,24 +196,14 @@ windeployqt6 --no-compiler-runtime --no-opengl-sw --no-system-dxc-compiler \
find ./*/ -name "*.dll" | while read -r dll; do deps "$dll"; done
```
## RedoxOS
The package install may randomly hang at times, in which case it has to be restarted. ALWAYS do a `sudo pkg update` or the chances of it hanging will be close to 90%. If "multiple" installs fail at once, try installing 1 by 1 the packages.
When CMake invokes certain file syscalls - it may sometimes cause crashes or corruptions on the (kernel?) address space - so reboot the system if there is a "hang" in CMake.
## Windows
### Windows 7, Windows 8 and Windows 8.1
## Windows 8.1 and below
DirectX 12 is not available - simply copy and paste a random DLL and name it `d3d12.dll`.
Install [Qt6 compatibility libraries](github.com/ANightly/qt6windows7) specifically Qt 6.9.5.
### Windows Vista and below
## RedoxOS
No support for Windows Vista (or below) is present at the moment. Check back later.
The package install may randomly hang at times, in which case it has to be restarted. ALWAYS do a `sudo pkg update` or the chances of it hanging will be close to 90%. If "multiple" installs fail at once, try installing 1 by 1 the packages.
### Windows on ARM
If you're using Snapdragon X or 8CX, use the [the Vulkan translation layer](https://apps.microsoft.com/detail/9nqpsl29bfff?hl=en-us&gl=USE) only if the stock drivers do not work. And of course always keep your system up-to-date.
When CMake invokes certain file syscalls - it may sometimes cause crashes or corruptions on the (kernel?) address space - so reboot the system if there is a "hang" in CMake.

View File

@@ -10,13 +10,7 @@ Simply put, types/classes are named as `PascalCase`, same for methods and functi
Except for Qt MOC where `functionName` is preferred.
Template typenames prefer short names like `T`, `I`, `U`, if a longer name is required either `Iterator` or `perform_action` are fine as well. Do not use names like `SS` as systems like solaris define it for registers, in general do not use any of the following for short names:
- `SS`, `DS`, `GS`, `FS`: Segment registers, defined by Solaris `<ucontext.h>`
- `EAX`, `EBX`, `ECX`, `EDX`, `ESI`, `EDI`, `ESP`, `EBP`, `EIP`: Registers, defined by Solaris.
- `X`: Defined by some utility headers, avoid.
- `_`: Defined by gettext, avoid.
- `N`, `M`, `S`: Preferably don't use this for types, use it for numeric constants.
- `TR`: Used by some weird `<ucontext.h>` whom define the Task Register as a logical register to provide to the user... (Need to remember which OS in specific).
Template typenames prefer short names like `T`, `I`, `U`, if a longer name is required either `Iterator` or `perform_action` are fine as well.
Macros must always be in `SCREAMING_CASE`. Do not use short letter macros as systems like Solaris will conflict with them; a good rule of thumb is >5 characters per macro - i.e `THIS_MACRO_IS_GOOD`, `AND_ALSO_THIS_ONE`.
@@ -24,45 +18,25 @@ Try not using hungarian notation, if you're able.
## Formatting
Formatting is extremelly lax, the general rule of thumb is: Don't add new lines just to increase line count. The less lines we have to look at, the better. This means also packing densely your code while not making it a clusterfuck. Strike a balance of "this is a short and comprehensible piece of code" and "my eyes are actually happy to see this!". Don't just drop the entire thing in a single line and call it "dense code", that's just spaghetti posing as code. In general, be mindful of what other devs need to look at.
Do not put if/while/etc braces after lines:
```c++
// no dont do this
// this is more lines of code for no good reason (why braces need their separate lines?)
// and those take space in someone's screen, cumulatively
if (thing)
{ //<--
{
some(); // ...
} //<-- 2 lines of code for basically "opening" and "closing" an statment
}
// do this
if (thing) { //<-- [...] and with your brain you can deduce it's this piece of code
// that's being closed
if (thing) {
some(); // ...
} //<-- only one line, and it's clearer since you know its closing something [...]
}
// or this, albeit the extra line isn't needed (at your discretion of course)
// or this
if (thing)
some(); // ...
// this is also ok, keeps things in one line and makes it extremely clear
// this is also ok
if (thing) some();
// NOT ok, don't be "clever" and use the comma operator to stash a bunch of statments
// in a single line, doing this will definitely ruin someone's day - just do the thing below
// vvv
if (thing) some(), thing(), a2(a1(), y1(), j1()), do_complex_shit(wa(), wo(), ploo());
// ... and in general don't use the comma operator for "multiple statments", EXCEPT if you think
// that it makes the code more readable (the situation may be rare however)
// Wow so much clearer! Now I can actually see what each statment is meant to do!
if (thing) {
some();
thing();
a2(a1(), y1(), j1());
do_complex_shit(wa(), wo(), ploo());
}
```
Brace rules are lax, if you can get the point across, do it:
@@ -103,21 +77,3 @@ if (device_name.empty()) {
SDL_AudioSpec obtained;
device = SDL_OpenAudioDevice(device_name.empty() ? nullptr : device_name.c_str(), capture, &spec, &obtained, false);
```
A note about operators: Use them sparingly, yes, the language is lax on them, but some usages can be... tripping to say the least.
```c++
a, b, c; //<-- NOT OK multiple statments with comma operator is definitely a recipe for disaster
return c ? a : b; //<-- OK ternaries at end of return statments are clear and fine
return a, b; //<-- NOT OK return will take value of `b` but also evaluate `a`, just use a separate statment
void f(int a[]) //<-- OK? if you intend to use the pointer as an array, otherwise just mark it as *
```
And about templates, use them sparingly, don't just do meta-templating for the sake of it, do it when you actually need it. This isn't a competition to see who can make the most complicated and robust meta-templating system. Just use what works, and preferably stick to the standard libary instead of reinventing the wheel. Additionally:
```c++
// NOT OK This will create (T * N * C * P) versions of the same function. DO. NOT. DO. THIS.
template<typename T, size_t N, size_t C, size_t P> inline void what() const noexcept;
// OK use parameters like a normal person, don't be afraid to use them :)
template<typename T> inline void what(size_t n, size_t c, size_t p) const noexcept;
```

10
docs/CrossCompile.md Normal file
View File

@@ -0,0 +1,10 @@
# Cross Compile
## ARM64
A painless guide for cross compilation (or to test NCE) from a x86_64 system without polluting your main.
- Install QEMU: `sudo pkg install qemu`
- Download Debian 13: `wget https://cdimage.debian.org/debian-cd/current/arm64/iso-cd/debian-13.0.0-arm64-netinst.iso`
- Create a system disk: `qemu-img create -f qcow2 debian-13-arm64-ci.qcow2 30G`
- Run the VM: `qemu-system-aarch64 -M virt -m 2G -cpu max -bios /usr/local/share/qemu/edk2-aarch64-code.fd -drive if=none,file=debian-13.0.0-arm64-netinst.iso,format=raw,id=cdrom -device scsi-cd,drive=cdrom -drive if=none,file=debian-13-arm64-ci.qcow2,id=hd0,format=qcow2 -device virtio-blk-device,drive=hd0 -device virtio-gpu-pci -device usb-ehci -device usb-kbd -device intel-hda -device hda-output -nic user,model=virtio-net-pci`

View File

@@ -294,22 +294,22 @@ sudo pkg install qt6 boost glslang libzip library/lz4 libusb-1 nlohmann-json ope
* Open the `MSYS2 MinGW 64-bit` shell (`mingw64.exe`)
* Download and install all dependencies:
```sh
```
BASE="git make autoconf libtool automake-wrapper jq patch"
MINGW="qt6-base qt6-tools qt6-translations qt6-svg cmake toolchain clang python-pip openssl vulkan-memory-allocator vulkan-devel glslang boost fmt lz4 nlohmann-json zlib zstd enet opus mbedtls libusb unordered_dense openssl SDL2"
# Either x86_64 or clang-aarch64 (Windows on ARM)
MINGW="qt6-base qt6-tools qt6-translations qt6-svg cmake toolchain clang python-pip openssl vulkan-memory-allocator vulkan-devel glslang boost fmt lz4 nlohmann-json zlib zstd enet opus mbedtls libusb unordered_dense"
packages="$BASE"
for pkg in $MINGW; do
packages="$packages mingw-w64-x86_64-$pkg"
#packages="$packages mingw-w64-clang-aarch64-$pkg"
done
pacman -Syuu --needed --noconfirm $packages
```
* Notes:
- Using `qt6-static` is possible but currently untested.
- Other environments are entirely untested, but should theoretically work provided you install all the necessary packages.
- GCC is proven to work better with the MinGW environment. If you choose to use Clang, you *may* be better off using the clang64 environment.
- ...except on ARM64, we recommend using clang instead of GCC for Windows ARM64.
- Add `qt-creator` to the `MINGW` variable to install Qt Creator. You can then create a Start Menu shortcut to the MinGW Qt Creator by running `powershell "\$s=(New-Object -COM WScript.Shell).CreateShortcut('C:\\ProgramData\\Microsoft\\Windows\\Start Menu\\Programs\\Qt Creator.lnk');\$s.TargetPath='C:\\msys64\\mingw64\\bin\\qtcreator.exe';\$s.Save()"` in Git Bash or MSYS2.
* Add MinGW binaries to the PATH if they aren't already:
* `echo 'PATH=/mingw64/bin:$PATH' >> ~/.bashrc`

View File

@@ -1508,7 +1508,10 @@ void BufferCache<P>::MappedUploadMemory([[maybe_unused]] Buffer& buffer,
[[maybe_unused]] u64 total_size_bytes,
[[maybe_unused]] std::span<BufferCopy> copies) {
if constexpr (USE_MEMORY_MAPS) {
auto upload_staging = runtime.UploadStagingBuffer(total_size_bytes);
constexpr u64 MAX_STAGING_SIZE = 2_GiB;
auto upload_staging = runtime.UploadStagingBuffer((std::min)(total_size_bytes, MAX_STAGING_SIZE));
if (upload_staging.mapped_span.size() < total_size_bytes) return;
//auto upload_staging = runtime.UploadStagingBuffer(total_size_bytes);
const std::span<u8> staging_pointer = upload_staging.mapped_span;
for (BufferCopy& copy : copies) {
u8* const src_pointer = staging_pointer.data() + copy.src_offset;

View File

@@ -92,19 +92,25 @@ void MaxwellDMA::Launch() {
}
}
} else {
// TODO: allow multisized components.
// TODO: xbzk: multisized components support.
// validadte this widely!
// shipped in PR 3164.
auto& accelerate = rasterizer->AccessAccelerateDMA();
const bool is_const_a_dst = regs.remap_const.dst_x == RemapConst::Swizzle::CONST_A;
if (regs.launch_dma.remap_enable != 0 && is_const_a_dst) {
ASSERT(regs.remap_const.component_size_minus_one == 3);
const u32 remap_components_size = regs.remap_const.component_size_minus_one + 1;
accelerate.BufferClear(regs.offset_out, regs.line_length_in,
regs.remap_const.remap_consta_value);
read_buffer.resize_destructive(regs.line_length_in * sizeof(u32));
std::span<u32> span(reinterpret_cast<u32*>(read_buffer.data()), regs.line_length_in);
std::ranges::fill(span, regs.remap_const.remap_consta_value);
read_buffer.resize_destructive(regs.line_length_in * remap_components_size);
for (u32 i = 0; i < regs.line_length_in; ++i) {
for (u32 j = 0; j < remap_components_size; ++j) {
read_buffer[i * remap_components_size + j] =
(regs.remap_const.remap_consta_value >> (j * 8)) & 0xFF;
}
}
memory_manager.WriteBlockUnsafe(regs.offset_out,
reinterpret_cast<u8*>(read_buffer.data()),
regs.line_length_in * sizeof(u32));
read_buffer.data(),
regs.line_length_in * remap_components_size);
} else {
memory_manager.FlushCaching();
const auto convert_linear_2_blocklinear_addr = [](u64 address) {

View File

@@ -479,6 +479,14 @@ void RasterizerVulkan::Clear(u32 layer_count) {
}
void RasterizerVulkan::DispatchCompute() {
#if defined(ANDROID) || defined(__linux__)
static u32 dispatch_count = 0;
if (dispatch_count < 2) {
dispatch_count++;
return;
}
#endif
FlushWork();
gpu_memory->FlushCaching();