In RAD Studio 10.3 we’ve made an interesting, very low-level change that affects both Delphi and C++. This is a change in how method parameters are passed when a method is invoked.
For the vast majority of customers, this issue won’t affect you -- or if it does, you will see the effect in terms of bugs that no longer occur.
However, you may be interested in what we changed, why, and the technical details, especially if you write low-level assembly code.
C++Builder consumes Delphi .pas files. It does so by the Delphi compiler emitting a C++ header file (.hpp) containing C++-ized versions of the types, methods, and classes in a Delphi unit.
For example, a Delphi method:
procedure Foo(APoint : TPoint);
is emitted in the generated header as:
void Foo(const TPoint& APoint);
This allows Delphi code to be easily used from C++. From the C++ compiler’s point of view, it’s using C++ types and methods.
The eagle-eyed might have already spotted a problem with the above conversion. The Delphi code just passes a TPoint parameter, by value. But the C++ side codifies this as by reference. Why?
This is the difference between an API (application programming interface, meaning the method declaration that you as a developer read and code against, including semantics such as passing by value) and ABI (application binary interface, meaning the way this is implemented under the hood, including how parameters are passed, such as in a register or on the stack.)
Delphi passes the variable by value; that is, from the user's point of view (the programming interface) it is a by-value parameter: you can read it, but writing to it has no effect on the variable the calling method passed. These are the programming semantics: reading the API, you know that regardless of how the compiler generates the assembly code you can rely on specific behaviour, in this case value semantics. It could actually be implemented any way at all, such as a pointer to a pointer to a pointer to a pointer to a memory location (to choose something absurd), so long as the compiler hid that implementation from you and, when you used it in your code, it behaved as by-value.
In C++, it’s written as passed by reference and is constant; this is clearly not by-value, but in terms of API semantics you cannot modify it, which achieves the same thing.
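As a minimal sketch of that equivalence in plain C++: the Point struct and both function names below are hypothetical stand-ins for illustration, not the real TPoint or anything the header generator emits. Either way the function is declared, the caller's variable is left untouched.

```cpp
#include <cstdint>

// Hypothetical stand-in for an 8-byte record like TPoint: two 32-bit integers.
struct Point {
    std::int32_t X;
    std::int32_t Y;
};

// By value: the callee works on its own copy, so writes never reach the caller.
inline std::int32_t SumByValue(Point p) {
    p.X += 100;  // modifies only the local copy
    return p.X + p.Y;
}

// By const reference: the callee sees the caller's object but cannot write to it.
inline std::int32_t SumByConstRef(const Point& p) {
    // p.X += 100;  // would not compile: p is const
    return p.X + p.Y;
}
```

From the caller's side the two declarations offer the same guarantee; the difference is only in what physically travels to the callee, which is the ABI part.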
In this case, a TPoint is an 8-byte value. On Win32, where registers used for parameter passing are 4-byte, it is passed as a reference (that is, as a pointer.) On Win64, however, the same registers are 8-byte which means the entire value can fit in a register. Rather than passing it by reference, requiring accessing the value by indirection, it could simply be passed in the register itself. That’s great codegen, because it’s more efficient.
C++ header generation was originally designed in the late nineties, when Windows 32-bit was the only platform around. As such, the distinction between API and ABI wasn’t taken into account. Because this TPoint value was passed by reference, although it was accessed in code as a value, the generated C++ header codified that:
void Foo(const TPoint& APoint); // the & indicates by reference; const makes it unmodifiable
And suddenly, an implementation detail had leaked into the API. TPoint had to be passed by reference. But on Win64, it was not.
This leaves us with a situation where on Win32, C++ will correctly find the TPoint value by reference (an indirection, i.e. the register contains a pointer to the value.) On Win64, it will also treat the register as a pointer to the value, because that’s what the API says. However, the pointer itself will be incorrect, because the register actually contains the TPoint value itself. Treating those 8 bytes as a pointer will effectively point to a random location, and so the TPoint variable accessed in code will contain junk data.
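The mismatch can be modelled in portable C++ without any assembly. This is only a simulation under stated assumptions: the 8-byte "register" is an ordinary std::uint64_t, and Point and the three helper names are hypothetical. It never dereferences the bogus pointer; it only shows that the two sides put different bits in the same slot.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for an 8-byte record like TPoint.
struct Point {
    std::int32_t X;
    std::int32_t Y;
};
static_assert(sizeof(Point) == 8, "this sketch assumes an 8-byte record");

// Side that passes by value: the register holds the record's own bytes.
inline std::uint64_t RegisterHoldingValue(const Point& p) {
    std::uint64_t reg = 0;
    std::memcpy(&reg, &p, sizeof(p));
    return reg;
}

// Side that passes by reference: the register holds the record's address.
inline std::uint64_t RegisterHoldingAddress(const Point& p) {
    return static_cast<std::uint64_t>(reinterpret_cast<std::uintptr_t>(&p));
}

// Reader that expects a value in the register and unpacks it again.
inline Point ReadRegisterAsValue(std::uint64_t reg) {
    Point p;
    std::memcpy(&p, &reg, sizeof(p));
    return p;
}
```

For a Point holding {1, 2}, the by-value register content is not even a multiple of the record's alignment, so it cannot be the address of any real Point. A reader that treats it as a pointer ends up somewhere unrelated, which is the junk data described above.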
There were a number of bug reports on the issue, including RSP-16209 and QC-115283. The issue was also documented. This issue affects all types or structs between 5 and 8 bytes in size. At 4 bytes or below, and above 8 bytes, parameter-passing behaviour (in a register or otherwise) was identical on both Win32 and Win64.
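If you want to flag candidate types in your own code, a size-only check is a rough first filter. The sketch below is an assumption-laden illustration: InAffectedSizeRange, IsAffectedSize and the three structs are hypothetical names, and the check looks only at sizeof, so it ignores the kind of type (floating point types, pointers and class references were not changed, whatever their size).

```cpp
#include <cstddef>
#include <cstdint>

// True for sizes of 5, 6, 7 or 8 bytes, the range described above.
constexpr bool InAffectedSizeRange(std::size_t bytes) {
    return bytes >= 5 && bytes <= 8;
}

// Convenience wrapper: applies the size-only check to a type.
template <typename T>
constexpr bool IsAffectedSize() {
    return InAffectedSizeRange(sizeof(T));
}

// Hypothetical records used purely for illustration.
struct Point { std::int32_t X; std::int32_t Y; };                  // 8 bytes
struct Small { std::int16_t A; std::int16_t B; };                  // 4 bytes
struct Large { std::int32_t A; std::int32_t B; std::int32_t C; };  // 12 bytes

static_assert(IsAffectedSize<Point>(), "8-byte record falls in the range");
static_assert(!IsAffectedSize<Small>(), "4-byte record is below the range");
static_assert(!IsAffectedSize<Large>(), "12-byte record is above the range");
```

Because everything is constexpr, the checks run at compile time and cost nothing at runtime.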
In RAD Studio 10.3, we wanted to solve the issue. There are two main ways to solve it:
In 10.3, we made changes to parameter passing for several calling conventions, with a focus on parameters between 5 and 8 bytes in size (that is, 5, 6, 7 and 8 bytes.) We also reviewed all platforms, including iOS, Android, Linux, macOS, and Windows, and all bitnesses we support on those platforms, as well as all calling conventions including cdecl, pascal, fastcall, register, and stdcall / winapi. (Many of these, while having a specification in the same way the Win64 ABI has a specification, often have differences between compilers and platforms. It’s one reason that if making or using a DLL, you normally choose cdecl or stdcall: they are closer to standardized. If you want to dig into details, Raymond Chen has an interesting series on the history of calling conventions, and Agner Fog has an excellent PDF on calling conventions.)
The vast majority of types had no changes. This includes floating point types (Single, Double, Extended), class references, pointers, dynamic arrays, strings, and so forth.
We believe the tweaks we made are compatible with platform expectations, with other compilers, and importantly between our own compilers.
We don’t tend to publish internals like this on the docwiki, because they can change release to release. However, if you need information (for example, perhaps you work on low-level code for profilers or debuggers, or have code for stack manipulation, inline assembly and naked functions, or similar) feel free to contact us.
You should see:
In other words, this is the best kind of bugfix: you won’t see any effect, and yet bugs have vanished.
I would rather know when a debugger with clang will work without errors.
Thanks for looking into this. Really helpful!
With the old linker, whenever a call to one of my ASM routines was made, it would use the BLX instruction to switch to ARM32 mode. Now, it uses the BL instruction instead, so it stays in THUMB mode. So it's not doing the shim magic anymore as you said.
Anyway, it is good to know where the issue originates, so I can make sure my ASM routines keep working on both older and new Delphi (or linker) versions.
A reply: yes, it was to do with the NDK, specifically the linker. Quoting: You might be running into an issue we also run into where the newer Android linker is stricter about when it inserts the ARM interworking shims (see https://sourceware.org/ml/binutils/2012-03/msg00121.html). In other words, we generate THUMB; if your code was in ARM32 and the linker inserted the shims, all was well. Now the linker is stricter about when it inserts the shims, so now the code must be rewritten in THUMB. Or, another way to address the issue is to make sure that the code meets the requirement for when the linker will handle the switch, which is what we did.
Glad it's useful!
Re ARM32 vs THUMB: I'll have to consult a compiler engineer. I do recall that when we updated the Android NDK we changed something to do with the startup code, which I think is Thumb.
Thanks for this clarification. I noticed I had to change some Win64 assembly code in my FastMath project, and now I know why!
I also noticed that the ABI for Android changed. I've written quite a lot of Neon assembly code, and this all stopped working with the new Delphi version. I used to assemble this code in ARM32 mode, but since Rio, it needs to be assembled in THUMB mode. This is only for Android though. For iOS32, it still needs ARM32 mode. So I end up with two different static libraries for Android depending on Delphi version.
Do you know anything about this ABI change?