- Part 1 - Journey through the .NET internals - Sorting
- Part 2 - List.Sort internals
- Part 3 - Array.Sort && TrySZSort
- Part 4 - Managed vs Unmanaged code and interop
- Part 5 - This Article
In this blog post we will answer the question: what is a calling convention? A calling convention is like a contract that describes how functions call each other at the assembly level.
It defines things like:
- the way arguments are passed to a function
- how values are returned
- how the function name is decorated
It specifies how (at a low level) the compiler will pass input parameters to the function and retrieve its results once it’s been executed. [1]
CPU, Machine code and instruction sets
If we go down to the lowest level of code, there is machine code.
This, BTW, is Fibonacci number generation written in machine code. I wouldn't be able to write it that way, but what matters is that at this lowest level it really doesn't matter whether the code comes from C# or any other language. It would be an (almost) impossible task to write code that way. That is why we have a higher abstraction on top of machine code: assembly.
The example below is the same Fibonacci number generation code, but in assembly.
This level is still very low. We operate very close to the CPU, using registers, the stack, and CPU instructions like jmp. Every CPU family supports its own instruction set.
The first microprocessor [4] had 46 instructions [5]. These days there are hundreds of them, as you can check in this list [6]. It all started with simple instructions, which were combined to build more complex operations. As these operations became very common, CPU designers added them as new instructions, often designing CPUs to execute them more efficiently.
Then there is also a difference between RISC processors and CISC processors such as x86. The former have a smaller number of instructions and require fewer transistors, which makes them more power efficient [7].
You can check the difference down below.
It is the same code, but for different CPU families with different instruction sets. Because of this difference, you need to compile the code for a specific machine. If you are familiar with the Linux world, it is a pretty standard procedure to download the source code of a program and build it yourself, on your machine, for your machine-specific context. More popular distributions have packages with pre-compiled binaries. Usually when you go to the release page of some software, for example ripgrep [8], you will see different binaries for different operating systems such as Linux, and for different kernels or families of CPUs. (BTW ripgrep is an amazing replacement for grep.)
This is partially why virtual machines were created, with platforms like .NET. They help with the portability of software: instead of compiling your code to a specific instruction set, you compile it to an intermediate language, such as IL in .NET or Java Bytecode, which is then compiled, usually lazily on the fly, by the virtual machine for the specific machine it runs on. This automates the whole process of building the code for your machine.
Functions in assembly
At this low level we operate with CPU instructions. The concepts of a function, an argument, or returning a value from a function don't exist. We can only use simple primitives like the accumulator, registers, the stack, labels and CPU instructions. These primitives can be combined to create more complex code, including something similar to functions.
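A minimal sketch of a high-level function to compare against, assuming a two-argument sum written in C. The name sum and the int signature are assumptions, inferred from the sum: label that shows up in the assembly discussed in this section.

```c
/* Hypothetical example: adds two integers and returns the result.
   Types, arguments, the + operator, return and scope are all concepts
   of the C language, not of the CPU. */
int sum(int a, int b)
{
    return a + b;
}
```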
This code is readable and it has the concepts of types (int), functions, arguments, the + operator, return and, of course, scope.
When you compile this code to assembly, you get a different view, with things like labels (sum:), CPU instructions (mov, add, ret), operations on the stack ([esp+4]), the stack pointer (esp) and registers (edx, eax). It is a completely different world.
Looking at this code you might ask:
- OK, I see the ret instruction, which I assume is return, but how does it work?
- Which value is returned?
- If I call it from another function, how will that function know how to get the value?
And that is why we have calling conventions: to create a contract that tells functions how to call each other.
Calling conventions can differ in many ways:
- where are the arguments stored - registers, stack
- where do you put the result of the function call (stack, register, memory)
- who is responsible for clean-up: the caller or the callee (this makes a difference in assembly code size: if the caller cleans up the stack, the compiler has to generate clean-up instructions next to every function call)
- who is responsible for saving the registers and bringing them back to their previous state (before the function was called)
You can check the list of x86 calling conventions here [9]. We will use cdecl and fastcall as examples.
CDECL and FASTCALL
If a function expects to be called using the cdecl convention, it expects:
- arguments to be on the stack
- caller cleaning the stack
If we then call this function using the fastcall convention, neither requirement will be met:
- with fastcall the first (two) arguments are kept in registers
- the stack won't be cleaned up, as fastcall assumes that the callee is responsible for that.
Source code [10].
This simple function multiplies numbers. We have a function cdecl, which is marked with the cdecl attribute to force this calling convention (it is actually the default, so the attribute is not needed).
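A minimal sketch of what such a function could look like, assuming GCC or Clang attribute syntax. The names multiply_cdecl and call_multiply_cdecl and the CDECL macro are hypothetical; the macro is only there because on non-32-bit-x86 targets the attribute is ignored with a warning.

```c
/* The cdecl attribute is a GCC/Clang extension that only matters on
   32-bit x86; on other targets the macro expands to nothing. */
#if defined(__i386__)
#define CDECL __attribute__((cdecl))
#else
#define CDECL
#endif

/* Hypothetical name. Under cdecl the caller pushes both arguments on
   the stack, the result comes back in eax, and the caller removes the
   arguments from the stack after the call returns. */
CDECL int multiply_cdecl(int a, int b)
{
    return a * b;
}

/* Hypothetical caller: with cdecl, the stack clean-up code is
   generated here, right after the call instruction. */
int call_multiply_cdecl(void)
{
    return multiply_cdecl(2, 3);
}
```

Building this with the flags listed in this post and inspecting the generated assembly should show the arguments being placed on the stack in the caller.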
I am compiling this code with these flags:
- -m32: forces a 32-bit executable; without this flag the calling conventions are ignored (couldn't find why)
- -O0: I don't want to optimize this code, as with such a simple example -O1 puts a static value in the caller (2 * 3 = 6)
- -fomit-frame-pointer: an optimization that removes frame pointers to make the asm code a bit simpler. (At the end of this post there is an example without this optimization, explained, if you are curious what the difference is.)
Don’t keep the frame pointer in a register for functions that don’t need one. This avoids the instructions to save, set up and restore frame pointers; it also makes an extra register available in many functions. It also makes debugging impossible on some machines. [11]
It removes these instructions.
This simplifies the code to this form.
For comparison, let's look at fastcall.
Source code [12].
It has a third parameter to show that only the first two arguments are passed through the registers.
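A minimal sketch of a three-parameter fastcall function, again assuming GCC or Clang attribute syntax. The name multiply_fastcall and the FASTCALL macro are hypothetical, and the register assignment in the comment is what GCC documents for 32-bit x86.

```c
/* The fastcall attribute is a GCC/Clang extension that only matters on
   32-bit x86; on other targets the macro expands to nothing. */
#if defined(__i386__)
#define FASTCALL __attribute__((fastcall))
#else
#define FASTCALL
#endif

/* Hypothetical name. Under GCC's fastcall on 32-bit x86:
     a -> ecx (first integer argument)
     b -> edx (second integer argument)
     c -> stack (everything after the first two)
   The callee removes the stack arguments before returning. */
FASTCALL int multiply_fastcall(int a, int b, int c)
{
    return a * b * c;
}
```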
We can simplify this code to the following.
There is no need to reserve space on the stack, move the values from the registers to the stack and then read them back from the stack. The compiler potentially does it for the following reason:
Arguments are first saved in stack then fetched from stack, rather than be used directly. This is because the compiler wants a consistent way to use all arguments via stack access, not only one compiler does like that. [13]
In the end we will analyse this code.
So this is it: examples of the differences between fastcall and cdecl. What would happen if we mixed the conventions? The example below shows what happens when a caller and a callee do not abide by the same convention.
The fastcall callee still assumes that the arguments were passed through registers, and obviously there will be some data in them. It is not the data passed by the caller, though, as the caller used the cdecl convention and passed the arguments on the stack. This results in unexpected and hard-to-debug behaviour. That is why calling conventions are important. There is a long history behind them [14][15][16][17].
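The mismatch described above can be sketched with a toy model in plain C. This is not real calling-convention code: the toy_cpu struct, its two "registers" and its small stack are all hypothetical stand-ins, used only to show why a callee that reads registers sees stale data when the caller put the arguments on the stack.

```c
#include <stddef.h>

/* Toy machine state (hypothetical): two argument registers and a stack. */
struct toy_cpu {
    int ecx;
    int edx;
    int stack[16];
    size_t sp; /* number of values currently on the stack */
};

typedef int (*toy_callee)(const struct toy_cpu *);

/* Callee following a cdecl-like contract: arguments are on the stack. */
int multiply_stack_args(const struct toy_cpu *cpu)
{
    return cpu->stack[cpu->sp - 1] * cpu->stack[cpu->sp - 2];
}

/* Callee following a fastcall-like contract: arguments are in registers. */
int multiply_register_args(const struct toy_cpu *cpu)
{
    return cpu->ecx * cpu->edx;
}

/* Caller following a cdecl-like contract: push the arguments right to
   left, call, then clean up the stack itself. It never touches ecx/edx. */
int call_with_stack_args(struct toy_cpu *cpu, int a, int b, toy_callee callee)
{
    cpu->stack[cpu->sp++] = b;
    cpu->stack[cpu->sp++] = a;
    int result = callee(cpu);
    cpu->sp -= 2; /* caller clean-up */
    return result;
}

/* Both sides agree on the contract: returns 2 * 3. */
int matching_call(void)
{
    struct toy_cpu cpu = { .ecx = 7, .edx = 9 }; /* leftover register values */
    return call_with_stack_args(&cpu, 2, 3, multiply_stack_args);
}

/* Contracts differ: the callee multiplies the leftover register values
   (7 * 9) instead of the arguments the caller passed. */
int mismatched_call(void)
{
    struct toy_cpu cpu = { .ecx = 7, .edx = 9 };
    return call_with_stack_args(&cpu, 2, 3, multiply_register_args);
}
```

In this model matching_call returns 6, while mismatched_call returns 63, a value that has nothing to do with the arguments 2 and 3. On a real CPU the leftover register contents are whatever earlier code happened to leave there.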