ABI Changes in RAD Studio 10.3

by Jan 23, 2019

In RAD Studio 10.3 we’ve made an interesting, very low-level change that affects both Delphi and C++: a change in how method parameters are passed when a method is invoked.

For the vast majority of customers, this change will have no visible effect. If it does affect you, you will see it as bugs that no longer occur.

However, you may be interested in what we changed, why, and the technical details, especially if you write low-level assembly code.

C++ header generation

C++Builder consumes Delphi .pas files. This works because the Delphi compiler emits a C++ header file (.hpp) containing C++-ized versions of the types, methods, and classes in a Delphi unit.

For example, a Delphi method:

procedure Foo(APoint : TPoint);

becomes

void Foo(const TPoint& APoint);

This allows Delphi code to be easily used from C++. From the C++ compiler’s point of view, it’s using C++ types and methods.

ABI vs API

The eagle-eyed might have already spotted a problem with the above conversion. The Delphi code just passes a TPoint parameter by value, but the C++ side codifies this as by reference. Why?

This is the difference between an API (application programming interface, meaning the method declaration that you as a developer read and code against, including semantics such as passing by value) and ABI (application binary interface, meaning the way this is implemented under the hood, including how parameters are passed, such as in a register or on the stack.)

Delphi passes the variable by value. That is, to you as a user of the programming interface, it is a by-value parameter: you can read it, but writing to it has no effect on the variable the calling method passed. These are the programming semantics: you know, reading the API, that regardless of how the compiler generates the assembly code, you can rely on specific behaviour, in this case value semantics. It could actually be implemented any way at all, even as a pointer to a pointer to a pointer to a pointer to a memory location (to choose something absurd), so long as the compiler hid that implementation from you and the parameter behaved as by-value when used in your code.

In C++, it’s written as passed by reference and constant. This is clearly not by-value, but in terms of API semantics you cannot modify it, which achieves the same thing.
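The equivalence described above can be sketched in plain C++. This is only an illustration: Point is a hypothetical stand-in for TPoint, and the function names are made up for this example rather than taken from any RAD Studio header.

```cpp
#include <cassert>

// Hypothetical stand-in for TPoint: two integers, 8 bytes on common platforms.
struct Point {
    int x;
    int y;
};

// By value: the callee works on its own copy, so the caller's variable
// cannot be changed through the parameter.
int sum_by_value(Point p) {
    p.x += 100;  // modifies the local copy only
    return p.x + p.y;
}

// By const reference: the callee sees the caller's object, but the compiler
// rejects any attempt to write to it (p.x += 100 would not compile here).
int sum_by_const_ref(const Point& p) {
    return p.x + p.y;
}

// Calls both forms and checks that the caller's variable is untouched.
void demo_parameter_semantics() {
    Point origin{1, 2};
    assert(sum_by_value(origin) == 103);
    assert(sum_by_const_ref(origin) == 3);
    assert(origin.x == 1 && origin.y == 2);
}
```

From the caller’s side the two declarations give the same guarantee (the argument is not modified), even though the mechanism used to deliver the argument may differ.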

In this case, a TPoint is an 8-byte value. On Win32, where registers used for parameter passing are 4-byte, it is passed as a reference (that is, as a pointer.) On Win64, however, the same registers are 8-byte which means the entire value can fit in a register. Rather than passing it by reference, requiring accessing the value by indirection, it could simply be passed in the register itself. That’s great codegen, because it’s more efficient.

C++ header generation was originally designed in the late nineties, when Windows 32-bit was the only platform around. As such, the distinction between API and ABI wasn’t taken into account. Because this TPoint value was passed by reference, although it was accessed in code as a value, the generated C++ header codified that:

void Foo(const TPoint& APoint); // the & indicates by reference; const makes it unmodifiable

And suddenly, an implementation detail had leaked into the API. TPoint had to be passed by reference. But on Win64, it was not.

Win64 C++ Compatibility

This leaves us with a situation where on Win32, C++ will correctly find the TPoint value by reference (an indirection, i.e. the register contains a pointer to the value.) On Win64, it will also treat the register as a pointer to the value, because that’s what the API says. However, the pointer itself will be incorrect, because the register actually contains the TPoint value itself. Treating those same 8 bytes as a pointer will effectively point to a random location, and so the TPoint variable accessed in code will contain junk data.
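To make the mismatch concrete, here is a minimal sketch of what “the register contains the value itself” means. It does not reproduce the actual compiler behaviour; Point is a hypothetical stand-in for TPoint, and register_image and from_register_image are illustrative helpers that simply copy the 8 bytes of the value into a 64-bit integer and back.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for TPoint: two 32-bit integers.
struct Point {
    std::int32_t x;
    std::int32_t y;
};
static_assert(sizeof(Point) == sizeof(std::uint64_t),
              "Point is expected to occupy exactly 8 bytes");

// The 8 bytes of a Point as they would sit in a 64-bit register
// when the value itself is passed.
std::uint64_t register_image(Point p) {
    std::uint64_t bits = 0;
    std::memcpy(&bits, &p, sizeof bits);
    return bits;
}

// Correct interpretation of those bytes: they are the value.
Point from_register_image(std::uint64_t bits) {
    Point p{};
    std::memcpy(&p, &bits, sizeof p);
    return p;
}

void demo_register_image() {
    const Point p{10, 20};
    const std::uint64_t bits = register_image(p);

    // Reading the register as a value recovers the point.
    const Point back = from_register_image(bits);
    assert(back.x == 10 && back.y == 20);

    // A callee expecting 'const Point&' would instead treat the same bits
    // as an address. It is deliberately never dereferenced here.
    const std::uintptr_t bogus_address = static_cast<std::uintptr_t>(bits);
    (void)bogus_address;
}
```

On a little-endian x86-64 machine, the point {10, 20} yields the bit pattern 0x000000140000000A, which is a perfectly good point but not a meaningful address.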

There were a number of bug reports on the issue, including RSP-16209 and QC-115283. The issue was also documented. This issue affects all types or structs between 5 and 8 bytes in size. At 4 bytes and below, and above 8 bytes, parameter-passing behaviour (register or otherwise) was identical on both Win32 and Win64.
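As a rough way to check whether one of your own types falls into that range, the size rule can be written as a compile-time predicate. This is a sketch of the rule as stated above only, not of the full ABI; in_affected_range and the three structs are hypothetical names chosen for this example.

```cpp
#include <cstddef>
#include <cstdint>

// True for sizes in the range described above: 5, 6, 7 or 8 bytes.
constexpr bool in_affected_range(std::size_t size_in_bytes) {
    return size_in_bytes >= 5 && size_in_bytes <= 8;
}

// Hypothetical example types of different sizes.
struct Small { std::int32_t value; };                    // 4 bytes
struct Point { std::int32_t x; std::int32_t y; };        // 8 bytes
struct Rect  { std::int32_t left, top, right, bottom; }; // 16 bytes

static_assert(!in_affected_range(sizeof(Small)), "4 bytes: not in the range");
static_assert(in_affected_range(sizeof(Point)),  "8 bytes: in the range");
static_assert(!in_affected_range(sizeof(Rect)),  "16 bytes: not in the range");
```

Because the checks are static_asserts, they are evaluated when the file is compiled and cost nothing at run time.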

Solution

In RAD Studio 10.3, we wanted to solve the issue. There are two main ways to solve it:

  1. Remove the accidentally encoded ABI information from the generated header API. That is, ensure that the C++ method translation does not force a specific implementation: it would carry API semantics only, without mixing in the ABI. The implementation would be on the compiler side, and both the Delphi and C++ compilers would have a common understanding of what ‘by value’ means, in the ABI, for parameters of specific sizes.

    This would change the method declarations in the headers. In this specific case, a const TPoint& in today’s translation would become a TPoint, to reflect the by-value semantics. While that’s fine at the compiler level, every C++ customer would suddenly find that form designer event handlers, or C++ classes descended from Delphi classes, in their own code no longer matched the headers. That would require a lot of rewriting or refactoring, and would be a large barrier to upgrading to the latest release: lots of work for a small gain. Clearly this was not an option we could pursue.

    Long-term, if we can implement it without any manual work required on your own code, we may consider this. There are no plans today.

  2. Adjust the codegen for Win64 such that the headers are valid. That is, adjust the handling of parameters between 5 and 8 bytes to have a consistent system, such that the generated header is always valid. This is the option we’ve implemented in 10.3.
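The source-compatibility cost of the first option can be illustrated with standard C++ alone. In this sketch, Point stands in for TPoint, and the handler type names and OnPointEvent are hypothetical; the point is only that a by-value signature and a const-reference signature are different types, so code written against one does not match the other.

```cpp
#include <type_traits>

// Hypothetical stand-in for TPoint.
struct Point {
    int x;
    int y;
};

// Shape of the signature the generated headers use today.
using HandlerByConstRef = void (*)(const Point&);
// Shape of the signature option 1 would have generated instead.
using HandlerByValue = void (*)(Point);

// The two are distinct types as far as the C++ compiler is concerned.
static_assert(!std::is_same<HandlerByConstRef, HandlerByValue>::value,
              "by-value and by-const-reference signatures differ");

// Existing user code written against today's headers.
void OnPointEvent(const Point&) {}

HandlerByConstRef current_handler = &OnPointEvent;  // matches today's headers
// HandlerByValue changed_handler = &OnPointEvent;  // would fail to compile
```

Every handler or override of this shape in customer code would need the same edit, which is why keeping the declarations unchanged and adjusting the codegen was the less disruptive route.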

In 10.3, we made changes to parameter passing for several calling conventions, with a focus on parameters between 5 and 8 bytes in size (that is, 5, 6, 7 and 8 bytes.) We also reviewed all platforms, including iOS, Android, Linux, macOS, and Windows, and all bitnesses we support on those platforms, as well as all calling conventions including cdecl, pascal, fastcall, register, and stdcall / winapi. (Many of these, while having a specification in the same way the Win64 ABI has a specification, often have differences between compilers and platforms. It’s one reason that if making or using a DLL, you normally choose cdecl or stdcall: they are closer to standardized. If you want to dig into details, Raymond Chen has an interesting series on the history of calling conventions, and Agner Fog has an excellent PDF on calling conventions.)

The vast majority of types had no changes. This includes floating point types like Single, Double, and Extended, as well as class references, pointers, dynamic arrays, strings, and so forth.

We believe the tweaks we made are compatible with platform expectations, with other compilers, and importantly between our own compilers.

We don’t tend to publish internals like this on the docwiki, because they can change release to release. However, if you need information (for example, perhaps you work on low-level code for profilers or debuggers, or have code for stack manipulation, inline assembly and naked functions, or similar) feel free to contact us.

What Effects Will You See?

You should see:

  • For Delphi, no visible effect – this is not something developers normally dig into. If you have assembly code for Win64, you may want to check its handling of parameters between 5 and 8 bytes.
  • For C++, you will see some bugs have been fixed. Notably, event handlers with TPoint parameters on Win64 will work correctly. Otherwise, just as with Delphi, this is a change that should not normally affect you.
  • If you have Win32 assembly code handling parameters for either language, you should not need to update it.

In other words, this is the best kind of bugfix: you won’t see an effect, and yet bugs have vanished.