April 26, 2020May 31, 2020 Deepesh Menon

Chapter 4 : The Stacks In C

Key words: stacks in C, execution model of C, function stack frames in C, function calls in C, varargs and variadic functions in C, stack corruption

Topics at a glance:

Stacks and its mysteries
C’s execution model and its companionship with stacks
The famous stack frames of ‘C’
Functions returning big values
Parameter size promotion – The size-default and integers
Variadic function and stack manipulation
The labyrinth of stack corruption – do all roads lead to Rome or can we take another path?

Stacks in C

This chapter discusses about the much anticipated topic in C – the stacks. Stack is a memory that behaves in a Last In First Out (LIFO) manner, for temporary storage of data.

Why stacks are so important in C? What all purposes that stacks serve in C? Stacks exist right from the very start of a C program’s execution and is an integral part of C’s execution model. Stacks have far-reaching role in C language. That said, it’s not surprising to say that more than 50 % of bugs in C code can be directly related to stack issues, while the other half can be indirectly related to stack issues. Am not kidding. Let us see how all stacks are used in C.

As I said, stacks exist right from the beginning of a C program. Setting up stack is required even before C program starts execution. The basic steps that a typical C run time does are as follows:

Clear all BSS regions to 0s
Initialize data segment regions with values from ROM, where usually the executable image is stored to RAM, where the image is loaded and actually executed.
Set up the stack pointer
Jump to C’s first function

In the above steps, initializing BSS, and Data segment is not mandated by the C language. But we do that for getting correct results from the program, or otherwise program will be useless. But the third step i.e. to set up the stack pointer is so crucial that without this program will not even execute. I will explain why stack pointer is so important even for a tiny little C code to execute correctly. But if you are so curious, just see how the assembly code generated by the compiler for your C program looks like. You will see a lot of stack operations here and there especially where function calls/returns are happening.

Let us see for what all purposes C uses stack mainly:

For storing a function’s local variables (a.k.a auto variables)
For passing parameters to a function
For passing the return address to a callee function ( invoked function )
For temporary storage used for computing intermediate expression’s results
For saving processor registers across function calls and returns

So, in short, C won’t work without stack. Is it now clear why point number 3 in C run time’s responsibilities is so important? i.e. to Set up stack pointer.

Execution model of C

C is a procedural programming language. Everything which could be executed in C should be part of a function. So, execution model of C simply means how the functions work, or rather how function-calls work. Let us understand in detail how C uses stack in implementing functions. For this, we need to first understand how stack frames are working. This is implemented differently from processor to processor. i.e. the way in which C compilers generate stack frames for Intel x86 is different from that of ARM 32. Processor manufacturers document properly function calling conventions to be followed by compilers for that processor.

I am taking a typical Intel x86 processor for explaining these.

Function stack frames in C

Let us understand the anatomy of a stack frame generated for a C function on x86.

Before that please note the following points:

Stack grows downwards. i.e. From higher address to lower addresses
For putting things to stack we use a push operation. Assume stack pointer is at 0x1000, if we push one data, data will be copied to 0x1000, and current stack pointer will become 0xFFC (i.e. 0x1000 – 0x4)
If you want to take any data from stack, you perform a pop operation. Let’s assume that current stack pointer is at 0xFFC, if we perform pop operation, stack pointer is incremented to 0x1000, and data from 0x1000 is returned/popped out of stack
Note that size of data moved in/out during stack operations such as push/pop depends on processor’s word length. I have taken Intel x86, on which word length is 4 bytes (32-bits).

Now, let us analyze a stack frame on a typical x86 for a C function as follows:

int function(int param0, int param1)
{
	int var0; 
	int var1; 
	// do some computation here 
	return var0; 
}

EBP: Base pointer register: Indicating the origin of current stack frame. Most of other locations are accessed using offsets to EBP register.

In C, each function during execution, will have a current stack frame starting at an address pointed by EBP register. The above figure shows what all information are stored in stack frame, and also, w.r.t EBP how to access those data from stack frame.

Function calls in C

Let us see the steps involved in functionA calling another functionB and finally returning back to functionA w.r.t how stack frames are manipulated by the C compiler.

Steps done in functionA, prior to calling functionB: Caller is functionA. Callee is functionB

functionA pushes processor registers such as EAX, ECX and EDX (This step is optional and will be done only if registers EAX, ECX and EDX are to be retained across function call. It’s very much likely that these registers will get changed in callee function.)
functionA pushes input parameter values required for functionB.
functionA saves EIP (current instruction pointer) by pushing that into stack. This is the return address used by functionB to return back to functionA.

NOTE: Step 3 is done automatically by ‘call’ instruction.

Steps done in functionB or here the callee function before starting actual computation defined in function body

functionB at first saves current EBP to stack by pushing it to stack.
functionB sets EBP to ESP, which will be used as the new stack frame address for the function
functionB allocates storage for local variables if any
functionB allocates temporary storage in stack if required. (This step is optional and is required only if some complex computation is happening in the function that may produce intermediate results)
functionB then saves EBX, ESI and EDI registers to stack. (This step is optional and is done on a need basis. Source and destination index registers are used when string operations are involved or in cases where there is movement of bulk data)
functionB now starts the actual function execution. i.e code written inside function body, will start execution only here.

Steps done before functionB returns back to functionA

functionB saves return value to EAX register (Intel convention)
functionB restores values of EBX, ESI, EDI from stack if they got changed in course of execution of functionB. (This step is optional. Only done on a need basis)
functionB doesn’t require local vars and temporary storage anymore. So it sets back stack pointer back to EBP.
Pops ESP contents (which is actually functionA’s EBP value) to EBP. This operation brings back functionA’s stack frame.
For returning back to functionA, functionB has to perform one last operation. functionB pops out the return address from stack and jumps to that address.

NOTE 1: Above steps 3 and 4 are done using ‘leave’ instruction of Intel x86.

NOTE 2: step 5 is done using ‘ret’ instruction

Steps done by functionA post return from functionB

functionA doesn’t require the parameters pushed onto stack before calling functionB. So here it can get rid of these by setting the ESP accordingly.
functionA saves the return value from EAX register
functionA restores back EAX, EBX and ECX registers if required.

Below is the C code used for demonstrating stack frames:

int functionB(int i)
{
	int j;
	
	// do something - compute j from i
	
	return j;
}

int functionA(int x, int y)
{
	int a;
	int b;
	
	// do some computation
	
	// calls functionB()
	b  = functionB(a);
	
	return b;
}

Below is the stack frame generated for the sequence functionA calls functionB returns to functionA.

The below assembly code generated by GCC shows these operations in detail. I have added comments for properly describing what’s going on at each line. Text following semicolon ‘;’ is a comment.

_functionB:
    pushl    %ebp           ; saves EBP of caller function 
                            ; - i.e. EBP of functionA to stack
    movl    %esp, %ebp      ; set EBP with current ESP. 
                            ; i.e. new stack frame is set for functionB
    subl    $16, %esp       ; allocates 12 bytes temporary storage 
                            ; + 4 bytes for one local variable in stack
    movl    -4(%ebp), %eax  ; access local variable 'j' using ebp - 4 
                            ; and copy it to eax
    leave                   ; reverse the stack frame 
    ret                     ; return back to functionB 
_functionA:
    pushl    %ebp           ; saves EBP of caller function
                            ; - i.e. EBP of main to stack
    movl    %esp, %ebp      ; set EBP with current ESP. 
                            ; i.e. new current stack frame is set for functionA 
    subl    $20, %esp       ; allocates 12 bytes temporary storage 
                            ; + 8 bytes for two local variables in stack 
    movl    -4(%ebp), %eax  ; access local variable 'a' using ebp - 4 
                            ; and copy it to eax
    movl    %eax, (%esp)    ; move eax contents to esp. This is used as 
                            ; the input parameter in functionB
    call    _functionB      ; calls functionB
    movl    %eax, -8(%ebp)  ; post return from functionB, 
                            ; saves functionBs return value from eax to 
                            ; local variable 'b' @ ebp - 8
    movl    -8(%ebp), %eax  ; copies value of 'b' to eax as 
                            ; return value from functionA
    leave                   ; reverse the stack frame 
    ret                     ; return back to main

Some extra points to notice here:

Function parameters are pushed into the stack by the caller from right to left. You can see that for functionA stack frame, int y is at EBP + 12 (a higher address) and int x is at EBP +8. That means main function had pushed int y (right most parameter) to stack first and then int x.
Temporary storage for both functions are fixed at 12 bytes. These are probably for saving EBX, ESI and EDI registers, each taking 4 bytes making a total of 12 bytes.
As per x86 C function calling conventions, there are certain registers which are to be saved by caller function such as EAX, ECX, EDX, whereas some registers are callee’s responsibility to save. EBX, ESI, EDI are callee saved registers.

Miscellaneous things related to stacks:

Functions returning more than 4 bytes.

C functions often return 4-byte data. But in certain situations you will have to return data which is more than 4 bytes from a function. A typical example is a function returning a struct data type. This is achievable only through stacks. We all know that return data less than and up to 4 bytes gets stored in EAX register as per Intel convention. But if the return data is more than word length (4 bytes), then EAX or CPU registers cannot be used directly for this.

What will happen under the hood when the below code is compiled:

typedef struct
{
	int i;
	int j;
	int k;
}my_struct;

my_struct func1(void)
{
	my_struct s = {1,2,3};
	return s;
}

If I have to invoke func1 as:

my_struct s0 = func1();

Compiler will automatically change the code (internally) to something as follows:

func1(&s0);

C will not support return value of more than 4 bytes directly. Here, C compiler changes the above function call to func1(&s0); For the programmer, even if func1() is a no argument function, compiler makes func1() to a single argument function. Once this happens, then the return value will get stored in the address given as the first parameter. As you already know parameters are passed in C, via stacks, without stack this trick will not work. Do you remember in C++ all non-static member functions get this parameter, popularly known as ‘this-pointer’ as their first parameter from the compiler automatically? So, this is not anything new. Compilers often change the way you have written the code but will make sure the semantics doesn’t change.

Parameter size promotion to integers:

Another direct effect of stack is on the way C passes function parameters. i.e w.r.t size of the data, all parameters with size less than or up to 4-bytes are promoted to integers i.e. 4-bytes data. Stack operations such as push/pop works only on data of CPU word length size.

Say, if your function is

void my_function(int I, char c, float f);

Then second parameter ‘c’ will be automatically up-sized to 4-bytes. Though it won’t lose any type attributes. It’s just allocated 4 bytes in stack memory. That’s all.

Other place of stack related size promotion is for function’s local variables. For this I will illustrate with an example right below:

void my_function(void)
{
	int i;
	char c;
	float f;
	
	printf("Address of i : 0x%X\n", &i);
	printf("Address of c : 0x%X\n", &c);
	printf("Address of f : 0x%X\n", &f);
	
	return;
}

Output is as below:

Address of i : 0x461CC3C
Address of c : 0x461CC3B
Address of f : 0x461CC34

The memory layout looks like below:

float f	0x461CC34 – 0x461CC37; 4 bytes
char c	0x461CC38 – 0x461CC3B; 4 bytes	But since char occupies only 1 byte, only 3B is used when you refer to ‘c’
int i	0x461CC3C – 0x461CC3F; 4 bytes

Ideally char ‘c’ could have been allocated @ 0x461CC38 itself, but since its allocated in function stack, 4-byte address boundary is a must. If it is allocated in 1 byte, then integer ‘i’ should start from 0x461CC39, which will be okay w.r.t memory utilization. But stack operations, as I told works with 4-byte data i.e CPU word length. You might have noticed, that even though the system is Intel x86 little endian, char ‘c’ value is stored at MSB location 0x461CC3B and not LSB location of 0x461CC38. This is required not to lose information. Actually you need to understand that char is not type-casted to int, but is just allocated in 4-bytes location; i.e. without losing the type information. Only if it is converted to an int type, you need to worry about byte ordering or endianness. Three bytes memory from 0x461CC38 to 0x461CC3A are left as gaps. If there were other 3 char variables declared in the function, they could have been allocated in those gaps by the compiler. This is in a way almost similar to the structure padding.

My point here is, in general data stored in stacks, whether it’s function parameters, or function local variables, will be aligned to word boundaries of the processor.

Varargs and variadic functions in C

You all know very well what are varargs or variadic functions in C. The most famous of them is the printf(). Variadic functions unlike other functions take variable number of arguments. For quickly showing you the meaning:

void normal_function(int arg1, int arg2);
void variadic_function(int arg1, …);

So what does the mysterious ellipsis notation‘…’ tells the compiler?

The presence of ellipsis ‘…’ tells the compiler in fact some critical information.

First, it tells that the function is variadic function and not a normal function.

Second, it makes the compiler to enforce putting parameters into stack.

Third is the corollary of second point. It makes sure none of the arguments of variadic functions are put in CPU registers. Many processors these days provides a set of generic CPU registers which can be used efficiently to put arguments. Arguments on stack requires accessing the memory (RAM), whereas argument on registers are part of CPU (registers within CPU) and are much faster. So arguments put in registers in effect makes function calls much faster.

So putting ‘…’ makes compiler NOT TO put any arguments in registers. Or at least, the last named argument before the variable argument and the following variable list of arguments will not be put in registers, and instead will be put on stack.

For example,

void my_variadic_function(int arg1, int arg2, int arg3, …);

Here, ‘arg3’ is the last named argument before the variable argument list starts.

There is a reason for this. I will explain with the help of a code.

Before that, you all are aware of the famous stdargs.h header file that we use to include in our “.c” files if we need to use variadic function conventions. This file defines the following useful macros, for manipulating the variable list arguments.

Refer to: https://en.wikipedia.org/wiki/Stdarg.h

*Macro name*	*Description*
va_list	is not a macro but a type for iterating arguments
va_start	Start iterating arguments with a va_list
va_arg	Retrieve an argument
va_end	Free a va_list.
va_copy	Copy contents of one va_list to another

My advice, there is no need to by-heart these macros and their meaning, if you actually know the following two things:

‘…’ makes the compiler to put arguments on stack and not registers
If you know how parameters are passed onto functions via stack, then we don’t need to use the stdargs macros.

Implementation of a variadic function without #include <stdargs.h>

Go through the code below:

#include <stdio.h>

int my_variadic_function(const int arg_count, ...)
{
	const int *const argument_stack_address = &arg_count;
	const int *arg_pointer = argument_stack_address + 1;
	
	for(int i = 0; i < arg_count; ++i)
	{
		printf("arg %d: %d\n", i, arg_pointer[i]);
	}
	
	return 0;
}

int main()
{	
	my_variadic_function(5, 10, 20, 30, 40, 50);
	
	return 0;
}

Result:

arg 0 : 10
arg 1 : 20
arg 2 : 30
arg 3 : 40
arg 4 : 50

So what’s happening here? Remember the stack frame and the way, parameters are pushed into stack from right to left order?

Let us understand what is this right to left order of parameters. Stacks grow downwards. i.e. from Higher address to lower address. Right most argument is pushed first making its place higher in the stack while left most (i.e. the very first) argument is pushed last making its place lower in the stack frame.

If you can get the address of the last named-argument, i.e. here it’s the first argument itself which is arg_count, start incrementing the address to get the other arguments from the stack. The first argument is at the lowest portion of parameter list in the stack and last argument is at the highest address region of stack w.r.t parameter list in stack frame.

So the logic is to get the lowest address and keep on incrementing the address to access the other parameters from the stack. The question now is where to stop navigating the stack frame is not mandated by C language. So put some logic in your code for stopping the stack navigation, otherwise you will end up corrupting the stack and can crash your program.

Here the logic that I have used is to mention the size of the variable argument list in the first named-argument ‘arg_count’. The code gets the address of ‘arg_count’ and from that address gets the addresses of the remaining parameters one by one and access those parameters from the stack. The above step iterates for arg_count times! Voila!!! So now you know a way for implementing your own variadic function without the stdargs.h. Processors such as Intel and ARM follows this convention. Make sure you understand the notion of the processor’s stack frame.

NOTE : There should be at least one named argument in the variadic function or else you cannot get the starting address of the parameter list, unless you write some inline assembly code for accessing the stack pointer. You may also need to understand the entire stack frame of the processor!

Stack Corruption

As C uses stack for almost everything related to its execution model, most of the run-time issues in C is most likely to be related to stack corruption.

Issues with use of heavyweight data in functions.

Heavyweight data means data with huge sizes such as structs, arrays, unions, big strings etc. Programmers often do the following:

Passing structs and unions as values instead of references
Defining auto variables (within function scope) of structs/unions/big sized arrays

Now you know that auto variables and function parameters are allocated in function’s stack frame and not in data segment. Declaring/instantiating heavyweight variables within function scope can consume lot of stack and is not recommended in general and in particular for embedded applications. Even in bigger platforms, with a reasonable amount of memory, can you imagine the stack abuse such functions do if they are recursive? Secondly, arrays are infamous for not checking memory bounds. Any rogue code can access locations outside memory bounds. If such arrays are allocated in stack, then their affect is much worse. Let’s see a simple code that shows how a rogue code can corrupt a function stack with a simple mistake.

#include <stdio.h>

void dangerous_arrays(int arg1)
{
	int i = 1;
	int j = 2;
	int k = 3;
	int l = 4;
	
	char array[4] = {0, 0, 0, 0};
	
	printf("before stack corruption i : 0x%X\n", i);
	printf("before stack corruption j : 0x%X\n", j);
	printf("before stack corruption k : 0x%X\n", k);
	printf("before stack corruption l : 0x%X\n", l);
	
	printf("before stack corruption array[0] : 0x%X\n", 0xFF & array[0]);
	printf("before stack corruption array[1] : 0x%X\n", 0xFF & array[1]);
	printf("before stack corruption array[2] : 0x%X\n", 0xFF & array[2]);
	printf("before stack corruption array[3] : 0x%X\n", 0xFF & array[3]);
	
	for(int index = 0; index < 20; ++index)
	{
		array[index] = (char)(index * 10);
	}
	
	puts("---------------------------------------------------");
	
	printf("after stack corruption i : 0x%X\n", i);
	printf("after stack corruption j : 0x%X\n", j);
	printf("after stack corruption k : 0x%X\n", k);
	printf("after stack corruption l : 0x%X\n", l);
	
	puts("");
	
	for(int index = 0; index < 20; ++index)
	{
		printf("after stack corruption array[%d] : 0x%X\n", index, 0xFF & array[index]);
	}
	
	return;
}

int main()
{
	dangerous_arrays(0);
	
	return 0;
}

Result is as follows:

before stack corruption i : 0x1
before stack corruption j : 0x2
before stack corruption k : 0x3
before stack corruption l : 0x4
before stack corruption array[0] : 0x0
before stack corruption array[1] : 0x0
before stack corruption array[2] : 0x0
before stack corruption array[3] : 0x0
---------------------------------------------------
after stack corruption i : 0xBEB4AAA0
after stack corruption j : 0x968C8278
after stack corruption k : 0x6E645A50
after stack corruption l : 0x463C3228
after stack corruption array[0] : 0x0
after stack corruption array[1] : 0xA
after stack corruption array[2] : 0x14
after stack corruption array[3] : 0x1E
after stack corruption array[4] : 0x28
after stack corruption array[5] : 0x32
after stack corruption array[6] : 0x3C
after stack corruption array[7] : 0x46
after stack corruption array[8] : 0x50
after stack corruption array[9] : 0x5A
after stack corruption array[10] : 0x64
after stack corruption array[11] : 0x6E
after stack corruption array[12] : 0x78
after stack corruption array[13] : 0x82
after stack corruption array[14] : 0x8C
after stack corruption array[15] : 0x96
after stack corruption array[16] : 0xA0
after stack corruption array[17] : 0xAA
after stack corruption array[18] : 0xB4
after stack corruption array[19] : 0xBE

If you understand how local variables are stored in C, then you will understand that array[4] is actually the lsb of ‘int l’ and array[7] is the msb of ‘int l’ and so on. In simple terms, rogue array has encroached the memory of the neighboring variables. This is an example of stack corruption. Did you figure out the bug in the code? While setting the values of array, the condition I have given in for loop is

index < 20

It should have been index < 3

When using arrays of known size, here are some recommendations:

If within loops make sure you use stop condition based on sizeof(array)/sizeof(data).
Or use a MACRO/ constant data for storing the size of array and use that in the code

Safer use of arrays:

char array[10]; // hardcoded size for array is 10

// safer array use option 1; using sizeof operator
for(int index = 0; index < size(array)/sizeof(char); ++index)
{
	array[index] = index * 10;
}

// safer array use option 2; using a constant size information 
const int size = 4;
char array[size];

for(int index = 0; index < size; ++index)
{
	array[index] = index * 10;
}

// safer array use option 3; using a macro for size information
#define SIZE 4
char array[SIZE];

for(int index = 0; index < SIZE; ++index)
{
	array[index] = index * 10;
}

//If passing arrays into functions, follow the convention I have shown below:

// the parameter declaration 'int array[ ]' explicitly conveys that its as array 
void do_some_operation(int array[], const int size) 
{
	for(int index = 0; index < size; ++index)
	{
		// do something using array here
	}
	
	return;
}

// I don't recommend the following at all.
void do_some_operation(int *array){...}
void do_some_operation(int *array, const int size){...}

The parameter declaration ‘int array[]’ is actually making use of pointers. It’s not pass by value. It’s still pass by reference.

Enjoyed the chapter? Let me know in the comments below. Thanks! 🙂

April 24, 2020May 31, 2020 Deepesh Menon

Chapter 3 : Unions

Key words: Struct and union, bit manipulation using union, bit-fields in C structs

Topics at a glance:

The wonders of union:
- Vagaries of behavior
- Same memory, different perspectives
- Elegant use of unions for mitigating bit manipulation
Struct datatype and it’s mysterious bit-field

Struct and Union

After arrays and their inherent address abstraction mechanisms, I’ll now turn my focus to unions and structures. They are also a category of composite data types in C. Unions and structures can hold members belonging to varying data types, whereas for arrays, members should be belonging to the same data type. Let’s start with unions.

Unions are also known as super variables. I guess in ‘C’, union is the only data structure defined by the language which exhibits vagaries of behavior. A union, can at times behave as an integer data type, at a different time, the same union can behave as a float data type, or a char data type or at sometimes even as arrays or even structures. That is why I told unions exhibit behavioral changes, or, we can put it like this :- unions exhibits multiple-personality. I will demonstrate it for you with an example.

typedef union 
{
	int i;
	char c;
	float f;
}my_union;

my_union u1;


u1.i = 5; // from here onwards u1 behaves as an integer till line 14
...

u1.f = 2.57F; // from here onwards u1 behaves as a float till line 17
...

u1.c = 'a'; // from here onwards u1 behaves as a character and so on...
...

In the above example, once you assign 2.567F to member ‘f’, the previously assigned integer value to ‘i’ will get changed. It will be replaced by the IEEE 754 floating point equivalent of 2.567F. i.e. once you assign a valid data to any member of the union, from there on, that union will behave as a variable belonging to that member’s data type, until the next data is assigned. You will run into programming horrors when you try to access any other member of union at this point, such as accessing ‘i’ after assigning a float value to ‘f’, accessing ‘f’ after assigning a char value to ‘c’ etc. The main reason for this issue is actually the most important feature of a union.

A union always occupy the same block of memory regardless of the members declared inside it.

The below code demonstrates this. I have declared a union type as follows:

typedef union 
{
	int i;
	float f;
	char c;
}my_union;

I have written the following code to manipulate this union:

my_union u1;
char *byte;
int b_count;

// use as integer
puts("As integer");
u1.i = 5;

byte = (char*)&(u1.i);
for(b_count = 0; b_count < sizeof(my_union); ++b_count)
{
	printf("Byte %d : 0x%X\n", b_count, byte[b_count]);
}

// use as float
puts("As float");
u1.f = 5.125F;

byte = (char*)&(u1.f);
for(b_count = 0; b_count < sizeof(my_union); ++b_count)
{
	printf("Byte %d : 0x%X\n", b_count, (byte[b_count]&0xFF));
}

// use as char
puts("As char");
u1.c = 'A';

byte = (char*)&(u1.c);
for(b_count = 0; b_count < sizeof(my_union); ++b_count)
{
	printf("Byte %d : 0x%X\n", b_count, (byte[b_count]&0xFF));
}

puts("\nWhere is the union 'u1' stored ?\n");
printf("Address of u1.i is 0x%X\n", &u1.i);
printf("Address of u1.f is 0x%X\n", &u1.f);
printf("Address of u1.c is 0x%X\n", &u1.c);

The above code produced the following result when run:

As integer
Byte 0 : 0x5
Byte 1 : 0x0
Byte 2 : 0x0
Byte 3 : 0x0
As float
Byte 0 : 0x0
Byte 1 : 0x0
Byte 2 : 0xA4
Byte 3 : 0x40
As char
Byte 0 : 0x41
Byte 1 : 0x0
Byte 2 : 0xA4
Byte 3 : 0x40

Where is the union 'u1' stored ?

Address of u1.i is 0x461CC44
Address of u1.f is 0x461CC44
Address of u1.c is 0x461CC44

When used as integer, printing the contents of union shows how 0x5 is stored in 4 bytes of memory in a typical little endian system (i.e. LSB first) (Refer to chapter 1 for more details on how integer is stored)
When used as float, it shows how 5.125F is stored in IEEE-754 single precision floating format in little endian system. (Refer to chapter 1 for more details on how float (IEEE 754) is stored)
When used as character, it shows how ‘A’ (ascii value of character ‘A’ is 0x41) is stored in memory
Results also show that when a union is used as char, accessing union as a 4-byte data causes undesirable effects. When you observe the results for char you can see that bytes 2 and 3 of union is still retaining remnants of older float value. This happens as all the members of union is getting stored in the exact same memory (here, 0x461CC44).
Unions are not self-managing. Programmer should be very careful on the current state of union and should treat union as a data type appropriate in that state. In the above example, after copying a character value of ‘A’ to union, code should not use the union as a 4-byte data type. If you want to change the type, then re-assign some value to a member of suitable data type.
The language just gives you a facility to use the same memory for storing different type of data, but of course not at the same time. In course of execution union’s behavior change, i. e. it takes the shape of the last assigned data type.

It is programmer’s responsibility to use the union wisely!

Bit manipulation using union

I will explain a typical use case where unions are most appropriate choice for a programmer. Any guess?

If you are an embedded programmer it would have struck your mind. I am talking about manipulating register values of a typical processor.

Bit-fields in C structs: A use-case

The use of union in conjunction with structure’s bit fields is a very powerful programming idiom in embedded world!

Please see the below declaration of a union.

Note: The bit ordering and alignment is implementation (underlying platform) dependent. Understand the word attributes such as byte alignment/ordering etc of your target platform and compiler support for bit-fields. Here, I have considered target platform as Intel x86 32 bit CPU and is compiled using gcc 7.4.0.

typedef char byte;

#define REG8_MASK 0x000000FFU

typedef union register_8_bits__
{
	byte value;
	
	struct bits__
	{
		byte b0 : 1;
		byte b1 : 1;
		byte b2 : 1;
		byte b3 : 1;
		byte b4 : 1;
		byte b5 : 1;
		byte b6 : 1;
		byte b7 : 1;
	}bits;
	
	struct nibbles__
	{
		byte low : 4;
		byte high : 4;
	}nibbles;
	
}register_8_bits;

See how the code easily and naturally manipulates register contents w.r.t bytes, bits and nibbles!!!

register_8_bits my_reg;
my_reg.value = 0x00; // clear register 

printf("Register value : %X\n", my_reg.value & REG8_MASK);

my_reg.bits.b2 = 1; // sets b2
my_reg.bits.b7 = 1; // sets b7

printf("Register value : %X\n", my_reg.value & REG8_MASK);
printf("Register value low : %X\n", my_reg.nibbles.low &0xF);
printf("Register value high : %X\n", my_reg.nibbles.high &0xF);

The result of the above code is as below:

Register value : 0
Register value : 84
Register value low : 4
Register value high : 8

Awesome, isn’t it ?

Let’s understand how this is happening? But before that, let me tell you the significance of this idiom in embedded world.

Usually programmers perform bit set/clear, nibble manipulation etc. using language’s bit manipulation operators such as bit left shit <<, bit right shift >>, Hexadecimal/Binary MASKS such as 0xFF and various other combinations. In embedded programming, updating register contents is just a routine task. Let us see how a C program is written to manipulate bits using typical C bit manipulation convention.

char my_reg; // 8 bit value. 

// Assume that bits are packed with msb b7 in left to lsb b0 in right

// for setting bit 2 and not changing other bit positions
my_reg = my_reg | 0x04;     //0x4 is '0100' in binary

// for clearing bit 5
// 0x20U which is 0010 0000 in binary;
// note that ~ operator is for getting
// the number's 1's complement.							
// ~0x20 => 1101 1111 in binary = 0xDF	 
my_reg = my_reg & ~(0x20); 

// a different way using bit shift << operation 
// for setting b1
my_reg = my_reg | (0x01 << 1);
// or in a little cryptic way using C's shorthand notation
my_reg |= (0x01 << 1);

Now using our union approach the same operations can be done in a more natural or simple way.

//The same operations using the above union
register_8_bits my_reg;

// clearing register to all 0
my_reg.value = 0x0;

// for setting bit 2
my_reg.bits.b2 = 1;
printf("b2 set, Register value : 0x%X\n", my_reg.value);

// for clearing bit 2 
my_reg.bits.b2 = 0;
printf("b2 cleared, Register value : 0x%X\n", my_reg.value);

// for setting b1
my_reg.bits.b1 = 1;
printf("b1 set, Register value : 0x%X\n", my_reg.value);

// for setting high nibble bits to '1110'
my_reg.nibbles.high = 0xE; // 0xE is 1110 in binary
printf("nibble high is set to 0xE, Register value : 0x%X\n", my_reg.value & 0xFF);
// NOTE: Mask with 0xFF is required as we use %X (4bytes) to print the result inside printf()

// for setting the entire 8 bits to 0xFF
my_reg.value = 0xFF;
printf("Value set to 0xFF, Register value : 0x%X\n", my_reg.value & 0xFF); 
// NOTE: Mask with 0xFF is required as we use %X (4bytes) to print the result inside printf()

Let us see the result of the above code using union approach:

b2 set, Register value : 0x4
b2 cleared, Register value : 0x0
b1 set, Register value : 0x2
nibble high is set to 0xE, Register value : 0xE2
Value set to 0xFF, Register value : 0xFF

Now, what you say? Which approach is better? I will definitely select union approach over the direct bit manipulation approach, as union approach is more intuitive.

Just few points on C’s bit-fields:

Make sure that your C compiler supports bit fields properly. Almost all standard C compilers such as gcc, clang and MSVC supports bit fields. Bit fields are defined in the C standard and is a very handy tool at times.
Byte ordering (endianness), bit ordering, alignment etc. are platform/implementation dependent, as I’ve mentioned above.3.
Never try to get address of members declared with bit-field. It results in undesirable behavior. Most of the compilers will not allow you to use ‘&’ i.e address of operation on bit-fields. At least, they’ll warn us what so ever.

Let’s see what is happening under the hood with the above union – register_8_bits? The answer is simple; Unions in ‘C’ guarantees that the compiler will allocate the exact same memory region for all the members declared inside. Here, the 8 bit value, member struct bits and nibbles are all allocated to the same memory region. With C’s struct datatype’s bit field, C compiler guarantees that only the amount of ‘N’ bits specified after ‘:’ will be referred when that specific struct member is used in the code. i.e. bit-fields allow you to refer to a variable’s bit position specifically.

So if you are very prudent about bit positions in your code, there is a direct language support.

By placing the members aptly inside a union along with struct’s bit-fields everything else will fit into the puzzle!

Enjoyed this chapter? Let me know in the comments below. Thanks! 🙂

April 23, 2020May 31, 2020 Deepesh Menon

Chapter 2 : Address Abstraction

Key words: Address abstraction in C, arrays in C, array indexing operation, negative indexing operation on arrays, bound check issues with arrays, function pointers in C, function types in C.

Topics at a glance:

Address abstraction through arrays.
Array size – does it really a matter of concern to the compiler or to the programmer ?
Negative indexes on arrays. Seriously! Does it work ?
Function pointers and how to use them correctly?

In this chapter, let us understand how composite data types are managed in ‘C’. ‘C’ supports the following composite datatypes:

Array
Union
Struct

A composite data type is an aggregate of one or more, data types. Hence, it is also known as ‘aggregate data type’. Various combinations are there. To name a few, arrays of primitive data types, structures and unions of different primitive data types etc. Composite data types can be often built from other composite data types such as structures of structures, arrays of structures, structure comprising of an array of another structure. The list goes on.

Address abstraction in C

We need to understand the notion of Address Abstraction in ‘C’. In a way, I am referring to pointers in ‘C’. But let us not start with pointers directly. Let’s get there slowly, beginning with array and its address abstraction mechanisms.

Arrays in C

An array is a collection of data of the same type, like, array of integers, array of floats, array of structures etc. An array stores its members always in contiguous memory locations, i.e. one after the other.

Consider the following array of integers.

int array_of_int[5] = {1, 2, 3, 4, 5};

Let’s see how this is stored in an arbitrary memory location starting from 0x1000. NOTE: I will not be explaining how an integer itself is stored in memory. For that please refer to chapter 1. Here, the focus is on a collection of integers and not an integer alone.

Array indexing operation

I have shown you how an array of five integers are created in memory. Now, let’s see how to access members of this array one by one. ‘C’ has defined indexing operation for accessing array members. i.e. we can index to a particular element location in an array.

array_of_int[0] will give ‘1’, 
array_of_int[1] will give ‘2’ 
array_of_int[4] will give ‘5’.

When we use indexing operation on an array (i.e. array[index]), programmer doesn’t need to directly manipulate addresses, the C compiler automatically does that for you.

Thus,

array_of_int[0] access the contents at location 0x1000
array_of_int[1] access the contents at location 0x1004
array_of_int[2] access the contents at location 0x1008
array_of_int[3] access the contents at location 0x100C
array_of_int[4] access the contents at location 0x1010

So, the question is, how the compiler knows how to find the location when using indexing operation on an array? When you look at the attributes of array, answer will become clear. Array stores members in contiguous address locations i.e. one after the other. Now the question is why arr_of_int[1] is at location 0x1004 and not at 0x1001 and array_of_int[2] is at 0x1008 and not at 0x1002. As I have explained in previous section, ‘C’ knows very well how to store an integer. Integers take 4 bytes in a typical 32-bit machine. ‘array_of_int[N]’ means, to access Nth member of array_of_int. If it is an array of integers, then to find the location, use the formula below.

Location of array_of_int[N] = Location of the very first element of the array + (N x size of int); Note that am talking about an array of integers here.

For instance, location of array_of_int[3] can be found as:

location of array_of_int[3] = 0x1000 + (3 x 4) = 0x1000 + 0xC (because, 12 in decimal is 0xC in hexadecimal)

Thus, location of array_of_int[3] is 0x100C.

Now, take another example where there is an array of chars. Say, arr_of_char[5]. Here, if we want to get the location of array_of_char[3], assuming array_of_char’s first member is stored at 0x2000.

location of array_of_char[3] = 0x2000 + (3 x 1) = 0x2000 + 0x3 = 0x2003; (size of char is 1 byte)

This doesn’t stop at primitive datatype.

Assume there is a structure struct my_struct of size 16 bytes. Assume there is an array of my_struct as below:

struct my_struct array_of_my_struct[5];

Here, assuming, that first member is stored at 0x3000

location of array_of_my_struct [3] = 0x3000 + (3 x 16) = 0x3000 + 0x30 (48 in decimal is 0x30 in hex) Therefore, location of array_of_my_struct[3] = 0x3030

Array manipulates addresses without bothering the programmer much! That’s the simplicity of an array.

At least, some of you may wonder, how the address of first element is obtained? For that, let’s re-visit the following code from previous section.

int i = 0x5; // hexadecimal ‘5’

Let’s understand now what is a variable. In the above code, variable name is ‘i’, and memory block starting from 0x1000 to 0x1003 has been named by the alias ‘i’. So, when ‘C’ compiler sees a direct reference to this alias further in the code, it knows that it’s the memory block of 4-bytes from 0x1000 to 0x1003. Variable names are always associated with addresses internally. By referring to a variable’s name, programmer is actually referring to the specific memory location where ‘C’ has stored the data.

i.e. A variable’s name is actually an alias to a specific memory location.

But when it comes to array, programmer doesn’t declare names for each location/members of the array.

Here, the array name is just ‘array_of_int’; The ‘[]’ indicates the compiler that ‘array_of_int’ is an array.

int array_of_int[5] = {1, 2, 3, 4, 5};

The most important thing for array name is this. Unlike, a primitive data type, or structs or unions, for an array, its name or alias, is made ONLY for the first element of the array.

Here, in this example, it is not the memory block ranging from 0x1000 to 0x1013 is aliased as ‘array_of_int’. It is just the first member’s location i.e. 0x1000 which stores an integer, in this example.

NOTE: In case of structs, which is an example of packed data in ‘C’, we have to individually name the data members, and the entire packed data bytes is aliased as the struct variable’s name. This is not the case with arrays.

Bound check issues with arrays

Whenever a programmer refers to ‘array_of_int’, it will get associated with memory location 0x1000. The alias ‘array_of_int’ is only for first member’s location, i.e. 0x1000, and not for a specific range of memory from 0x1000 to 0x1013. By associating alias ‘array_of_int’ with an index of ‘6’ or ‘7’, the compiler, simply generates the code to access a location with address 0x1018 and 0x101C respectively. Syntactically it is correct. But Semantically (behaviorally) it is only PLAIN WRONG!

The biggest problem with arrays is this. Array variable name, DOES NOT CHECK for memory bounds.

Again consider array_of_int stored at 0x1000.

Can you guess why array index always starts from 0 in C? Remember the formula for accessing members through array indexing operation? Just substitute index as 0 in that formula to get the very first member of an array. You will get the location pointed by alias array_of_int i.e 0x1000.

i.e. Location of array_of_int[0] is the address associated with the alias array_of_int + ( 0 x 4 ), which is 0x1000. Just understand why it HAS TO start from 0.

I would like to point out one more thing before concluding on arrays. We know very well that array[index] can be used to access members at a specific location in the array. After going through the sections above, you got an idea of how addresses are manipulated using array names and indexing operation.

This is what happening under the hood when you perform indexing operation on an array:

Array name gives the address of first member of the array i.e. start of array
Index value tells the C compiler to access a specific location within the array
Compiler generates the required amount of offset from start of array obtained in step 1 using the formula index multiplied by size of an array member
Compiler finally adds the offset obtained in step 3 with start of array address obtained in step 1, to get the final address

The above four steps is what an indexing operation on array names do. So in simple terms indexing is just this:

Origin of array + required offset; i.e. a simple addition.

Anybody who knows that A + B = B + A, can see how,

Origin + Offset = Offset + Origin

Now, for the above theorem, I can state the corollary also

i.e. array[index] = index[array]

The following code snippet produces the exact same results for both array[index] and index[array] forms.

#include <stdio.h>

int main()
{
	int arr_of_int[5] = {1, 2, 3, 4, 5};

	printf("arr_of_int[0] : %d\n", arr_of_int[0]);
	printf("arr_of_int[1] : %d\n", arr_of_int[1]);
	printf("arr_of_int[2] : %d\n", arr_of_int[2]);
	printf("arr_of_int[3] : %d\n", arr_of_int[3]);
	printf("arr_of_int[4] : %d\n", arr_of_int[4]);

	puts("-------------------------------------");

	printf("0[arr_of_int] : %d\n", 0[arr_of_int]);
	printf("1[arr_of_int] : %d\n", 1[arr_of_int]);
	printf("2[arr_of_int] : %d\n", 2[arr_of_int]);
	printf("3[arr_of_int] : %d\n", 3[arr_of_int]);
	printf("4[arr_of_int] : %d\n", 4[arr_of_int]);
	
	return 0;
}

Output of the above code:

arr_of_int[0] : 1
arr_of_int[1] : 2
arr_of_int[2] : 3
arr_of_int[3] : 4
arr_of_int[4] : 5
-------------------------------------
0[arr_of_int] : 1
1[arr_of_int] : 2
2[arr_of_int] : 3
3[arr_of_int] : 4
4[arr_of_int] : 5

Negative indexing operation on arrays

To add one more interesting thing for your enthusiastic mind, let me ask you one question. What happens if you use a negative index on array. i.e array[ – index ]. ‘C’ compiler generates a code to access a location pointed by start of array – required offset. (note the minus ‘–‘ sign). Of course, you really never would require to do it. But understand that this is how negative index works on array.

Let me explain negative indexing operation with an example.

Say,

int array[3] = {1, 2, 3};
int *pointer = &(array[1]);

pointer[- 1] gives you ‘1‘ which is actually array[0]. I hope you understood the way it is working.

I have not explained pointers yet. But just know that indexing operation is defined in C language for special variable’s that holds address. Pointers are examples for such variables. If you have correctly understood this chapter, then you will know that array names are also special variables that holds address to the first member of array. So, indexing operation is valid not only for array names, but also for any valid “pointers to data” in C. There are pointers to locations holding other than data in C. I am referring to function pointers. Just one advice. Don’t do indexing operations and address manipulations with function pointers. If you come across a function pointer in ‘C’ the only thing you should do with that, is to invoke the function using the function pointer.

Function pointers in C

Normal pointers are special variables in ‘C’ that can hold addresses to data, whereas function pointers are special variables that can hold addresses to functions. Let us walk through a code and see how all we can use a function pointer in C:

#include <stdio.h>

typedef void function_type(int a);
typedef void (*function_ptr_type)(int a);
void (*function_ptr)(int a); // not a type definition, 

void function1(int i)
{
	printf("i : %d\n", i);
}

int main()
{
	// &function1 is not required, as function name, 
	// just like array names gives function's address
	function_type *f_ptr_1 = function1; 
	f_ptr_1(1);
	
	function_ptr_type f_ptr_2 = function1;
	// or simply, f_ptr(2). No need to explicitly dereference. 
	(*f_ptr_2)(2); 						
	
	// Here function_ptr is actually 
	// a function pointer variable 
	function_ptr = function1;			
	function_ptr(3);	
	
	return 0;
}

Result:

i : 1
i : 2
i : 3

Function Pointers and Function Types

To get a function’s address either use ‘&function_name’ or simply ‘function_name’
In the above code all the three approaches of invoking function1() is not direct, but indirectly through function pointers.
When you invoke a function through a function pointer, both the following approaches are valid:
- a_function_pointer()
- (*a_function_ptr)()
‘function_type’ is just a type definition for a function. To use this, we need to define a pointer of type ‘function_type’
‘function_ptr_type’ is a type definition for a function pointer. To use this, we need to define one variable of type function_ptr_type
‘function_ptr’ is NOT a type but a variable itself in global scope. You cannot instantiate another variable of type ‘function_ptr;

When to use function pointer?

I will explain a simple use case of function pointers in C. Function pointers are used widely to implement callback in ‘C’. Callbacks are like a contract between two software entities. Party B signing the contract asks the other party A to invoke this particular function whenever any specific event happens. Party A invokes this function as a regular function but through this function pointer. That means party A gets the event, and party B is notified through this callback mechanism. When this function is executed, party B assumes that the event has triggered. This can also be used for delegation.

Enjoyed this chapter? Let me know in the comments below. Thanks! 🙂

April 22, 2020May 31, 2020 Deepesh Menon

Chapter 1 : C’s Abstraction Mechanisms

Key words: data abstraction in C, fundamental data types in C, little endian and big endian byte ordering, extended sign bit representation, 1’s and 2’s complement, IEEE 754, single precision floating point representation, storing negative floats

Topics at a glance:

The simplicity of ‘C’ and it’s power – the art of data abstraction
Storing Integers – ordering bytes and packing them together
Storing negative numbers – adding a negative sign ‘-‘ is not just enough
Floating point numbers and IEEE 754 format

What is so interesting about the C programming language? It’s simplicity and power. It works so close to the hardware. The direct manipulation of addresses through pointers is what makes C so powerful and flexible. We all know, what are pointers and how useful they are for programmers. But there are various other features also, that ‘C’ provides, and it is imperative to know them to understand the philosophy of ‘C’.

C is a procedural programming language. There are essentially two aspects that a programmer should know when we talk about C’s realization of this programming paradigm.

Aspect 1: Data abstraction

Aspect 2: Execution model

Before going into further explanation, let me set some context right. Go through the points below.

For all my analysis, I have taken a 32-bit machine, unless specified otherwise.
32-bit is also grouped as 4 bytes, 1 byte being 8 bits.
For representation of data, hexadecimal number system is used instead of binary, for convenience.

For a quick reference go through the table below. It shows decimal, binary and hexadecimal values of numbers from 0 to 100.

Decimal	Binary	Hexadecimal	Decimal	Binary	Hexadecimal	Decimal	Binary	Hexadecimal	Decimal	Binary	Hexadecimal
0	0	0	25	11001	19	50	110010	32	75	1001011	4B
1	1	1	26	11010	1A	51	110011	33	76	1001100	4C
2	10	2	27	11011	1B	52	110100	34	77	1001101	4D
3	11	3	28	11100	1C	53	110101	35	78	1001110	4E
4	100	4	29	11101	1D	54	110110	36	79	1001111	4F
5	101	5	30	11110	1E	55	110111	37	80	1010000	50
6	110	6	31	11111	1F	56	111000	38	81	1010001	51
7	111	7	32	100000	20	57	111001	39	82	1010010	52
8	1000	8	33	100001	21	58	111010	3A	83	1010011	53
9	1001	9	34	100010	22	59	111011	3B	84	1010100	54
10	1010	A	35	100011	23	60	111100	3C	85	1010101	55
11	1011	B	36	100100	24	61	111101	3D	86	1010110	56
12	1100	C	37	100101	25	62	111110	3E	87	1010111	57
13	1101	D	38	100110	26	63	111111	3F	88	1011000	58
14	1110	E	39	100111	27	64	1000000	40	89	1011001	59
15	1111	F	40	101000	28	65	1000001	41	90	1011010	5A
16	10000	10	41	101001	29	66	1000010	42	91	1011011	5B
17	10001	11	42	101010	2A	67	1000011	43	92	1011100	5C
18	10010	12	43	101011	2B	68	1000100	44	93	1011101	5D
19	10011	13	44	101100	2C	69	1000101	45	94	1011110	5E
20	10100	14	45	101101	2D	70	1000110	46	95	1011111	5F
21	10101	15	46	101110	2E	71	1000111	47	96	1100000	60
22	10110	16	47	101111	2F	72	1001000	48	97	1100001	61
23	10111	17	48	110000	30	73	1001001	49	98	1100010	62
24	11000	18	49	110001	31	74	1001010	4A	99	1100011	63
									100	1100100	64

Now, let us start exploring the philosophical aspects of C.

Aspect 1: Data abstraction

Fundamental data types in C

C support 5 fundamental data types.

‘char’: Obvious candidates are ‘A’, ‘a’, ‘B’, or any 8-bit values
‘int’: for integers such as 0, 1, 2 …
‘float’: for real numbers such as 1.25, 0.123, 1000.505 …
‘double’: For double precision floating point numbers
‘void’: for nothing. (of course void* is a completely different story, which I am saving for a future discussion.)

Little endian and big endian byte ordering

Let us see how these various datatypes are abstracted in ‘C’, starting with integers. I will explain what is happening, from a storage perspective.

The following figure shows how an integer ‘5’ typically gets stored in little endian and big endian systems.

int i = 5; // assume 'i' is stored at an arbitrary address 0x1000

4 bytes are allocated for storing integer
These 4 bytes together is a named location in memory; name is ‘i’
In Little endian systems, it is always LSB first.
In Big endian systems, it is MSB first
Endianness of a system determines the byte ordering.

So, now you can probably guess how the following code is going to work.

int j = 0x12345678 // assume j is stored at address 0x2000

Let’s see how this is different for a char data.

char c = 'A'; //assume c is stored at address 0x3000

Points to note for ‘char’

Only one byte is allocated in memory for storing character
This one byte of memory is named as ‘c’
Regardless of endianness, ‘A’ is stored at 0x3000 in both the systems

Extended sign bit representation

Next, let us see, how negative numbers are getting stored?

 int i_neg = -5; // assume i_neg is stored at address 0x4000

Before depicting how this is stored in memory, let me explain about signed number representation in computers.

Most significant bit (MSb) i.e. bit 31 for sign bit. If set (‘1’) it means the number is negative. If clear (‘0’), it means the number is positive. (Note that a 32-bit data is stored in bits ranging from bit-31 to bit-0)
Remaining 31 bits will be split into two parts. Part 1, from bit 30 to bit n for sign bit extension. Part 2, from bit n-1 to bit 0 for the numbers 2’s complement format

Let us go through the steps for converting ‘-5’ to its binary representation:

Number 5 in binary is 101
1’s complement is 010
Adding 1 to the 1’s complement to get 2’s complement, which is 011
Sign bit is set to 1, as it is negative 5
With sign bit extension it is, 1111 1111 1111 1111 1111 1111 1111 1011

Now, combining this information along with C’s style of storing, the number ‘-5’ i.e. Negative 5 will be stored as depicted below.

int i_neg = -5; //assume 'i_neg' is stored at address 0x4000

IEEE 754 : Single precision floating point representation

Now, let us examine how floating point numbers are stored. Floating point numbers are stored a little differently from normal integers.

Take for example a floating point number 5.125

float f = 5.125; // assume ‘f’ is stored at address 0x5000

We know that binary of ‘5’ is 101. But how to find the binary of 0.125? I will walk you through the algorithm for converting a decimal part of a fractional number to binary.

Consider the following fractional number notation I’ve used here:

x.y

Here, ‘x’ is the integral part and ‘y‘ is the decimal part, which is written after the decimal point ‘.‘

1. Multiply the decimal part by number ‘2′.

2. Integral part of resulting fractional number will be the first digit of fractional binary number.

3. Repeat steps 1 and 2, but now, using only decimal part of the fractional number.

Let us go through the steps, for finding the binary representation of 0.125.

For finding binary of 0.125 using the above algorithm,

Number	Result = Number x 2	Left of decimal point of Result
0.125	0.250	0
0.250	0.500	0
0.500	1.000	1

Therefore, 0.125 is 0.001 in binary

i.e 5.125 is 101.001 in binary representation

Now, let us see how ‘C’ stores and manipulates floating point numbers:

Any standard ‘C’ compiler, unless specified otherwise, uses the well-known IEEE 754 format (see the figure) for representing floating point numbers.

Now, to represent in IEEE 754 format, we need three parts, as depicted in the figure.

One sign bit: ‘0’ for positive floats, ‘1’ for negative floats
Exponent – 8 bits (Biased exponent, derived from normalized mantissa)
Normalized mantissa – 23 bits

Steps for getting exponent and mantissa parts:

In normalized mantissa only a single bit-‘1’ to the left of decimal point is allowed.

Therefor rewriting, 5.125 which is 101.001 in binary as 1.01001 x 2^2 , to get the normalized mantissa.

Note 1: 2^2 means 2 to the power of 2.

Note 2: Multiplying a number with 2^(n), where n positive, will shift the bits n times towards left.

Note 3: on the contrary, if multiplication is by 2^(-n), then number will get shifted n bits towards the right.

What is a Biased Exponent?

Sometimes, while determining a normalized mantissa part, it will be necessary to multiply by a 2^(negative number). In the above example, we multiplied by 2^(positive 2). But, what if the number was instead 0.0111001? In that case, to get a normalized mantissa, the number will be put as 1. 11001 x 2^(-2). So, raw exponent can be either positive or negative. The idea is to represent both positive and negative raw exponents using 8 bits available. The range of numbers from ‘0 – 255‘, will be split into two halves: -127 to -1 and 0 to 127. We can’t use sign bit binary representation here for negative/positive sign information for the 8-bits exponents. Therefore, a bias of 127 is added to both halves making the numbers back to 0 to 255. I hope you got the trick they have applied here.

To get biased exponent from raw exponent use the following formula:

Biased Exponent = Raw exponent (-ve or +ve) + 127

So, putting all these pieces together, we eventually arrive at the actual representation of 5.125

5.125 is 101.001, which is also 1.01001 x 2^2

Sign bit : 0 (as 5.125 is positive)

Biased Exponent : 2 + 127 = 129 = 10000001 (8 bits)

Normalized mantissa part : 01001

‘sign’ + ‘biased exponent’ + ‘normalized mantissa’ together is 01000000101001.

To get a full 32-bit number append 0’s i.e.

01000000101001 + 000000000000000000 = 01000000101001000000000000000000

This is equal to 0x40A40000 in hexadecimal

All these will culminate to the way C stores a floating point number as depicted below:

float f = 5.125F; // assume 'f' is stored at address 0x5000

Storing negative floats

Let us see how negative float is represented in ‘C’? In IEEE 754, there is a sign bit. For negative floats, just set this bit to ‘1’.

Let us consider a negative floating point number as below:

float f_neg = -5.125F; //assume 'f_neg' is stored at address 0x6000

This is how ‘C’ will store a negative float, -5.125F in memory:

Enjoyed this chapter? Let me know in the comments below. Thanks! 🙂

The C and C++ Club

Let's Learn Together! :)

Chapters on C

Chapter 4 : The Stacks In C

Stacks in C

Execution model of C

Function stack frames in C

Function calls in C

Varargs and variadic functions in C

Stack Corruption

Chapter 3 : Unions

Struct and Union

Bit manipulation using union

Bit-fields in C structs: A use-case

Chapter 2 : Address Abstraction

Address abstraction in C

Arrays in C

Array indexing operation

Bound check issues with arrays

Negative indexing operation on arrays

Function pointers in C

Function Pointers and Function Types

Chapter 1 : C’s Abstraction Mechanisms

Aspect 1: Data abstraction

Fundamental data types in C

Little endian and big endian byte ordering

Extended sign bit representation

IEEE 754 : Single precision floating point representation

Storing negative floats