Friday, January 27, 2012

C++ exception internals

In this post, we'll try to analyze the internals of how exception handling works in C++. I thought this would be pretty straightforward, given how simple exceptions are to use in C++. I was wrong. Quite a lot of interesting stuff happens internally, which wasn't very obvious to me at first. So, let's go find out.

So, what is the big deal with exception handling ?

There is at least one big deal. When an exception is thrown, the exception handling mechanism has to unwind stack frames, one after another, until it finds a frame that handles the exception. So, let's first focus on this part of exception handling: throwing an exception.

1. Throwing an exception:

Let's start with the simple class below: 

class A {
public:
  A() {}
  ~A() {}
};

class B {
public:
  B() {}
  ~B() {}
};

class C {
public:
  C() {}
  ~C() {}
};
 
void function1() {
  A a;
  B b;
  C c;
}

int main() {
  function1();
  return 0;
}
 

function1() is our main function of interest here. So, let's disassemble it. Below is the assembly code for it.

(gdb) disassemble function1  
Dump of assembler code for function function1():
   0x080484e4 <+00>:    push   %ebp
   0x080484e5 <+01>:    mov    %esp,%ebp
   0x080484e7 <+03>:    sub    $0x28,%esp
   0x080484ea <+06>:    lea    -0xb(%ebp),%eax
   0x080484ed <+09>:    mov    %eax,(%esp)
   0x080484f0 <+12>:    call   0x804859c <A::A()>
   0x080484f5 <+17>:    lea    -0xa(%ebp),%eax
   0x080484f8 <+20>:    mov    %eax,(%esp)
   0x080484fb <+23>:    call   0x80485a8 <B::B()>
   0x08048500 <+28>:    lea    -0x9(%ebp),%eax
   0x08048503 <+31>:    mov    %eax,(%esp)
   0x08048506 <+34>:    call   0x80485b4 <C::C()>
   0x0804850b <+39>:    lea    -0x9(%ebp),%eax
   0x0804850e <+42>:    mov    %eax,(%esp)
   0x08048511 <+45>:    call   0x80485ba <C::~C()>
   0x08048516 <+50>:    lea    -0xa(%ebp),%eax
   0x08048519 <+53>:    mov    %eax,(%esp)
   0x0804851c <+56>:    call   0x80485ae <B::~B()>
   0x08048521 <+61>:    lea    -0xb(%ebp),%eax
   0x08048524 <+64>:    mov    %eax,(%esp)
   0x08048527 <+67>:    call   0x80485a2 <A::~A()>
   0x0804852c <+72>:    leave 
   0x0804852d <+73>:    ret   
End of assembler dump. 


Okay. All is fine here. Things are pretty straightforward. Constructors for classes A, B and C are called. Their respective destructors are called in the reverse order. None of the constructors or destructors throw an exception here. So, no problems here. Note the leave and ret instructions at the end of the function. They indicate a normal function return, meaning they get executed when we encounter a return statement ( return 10; ) in C / C++, or when we simply reach the end of a void function, as we do here.
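To watch the same order without a disassembler, here is a minimal sketch. The calls log and the traced_function1() helper are my own additions for illustration; the classes above have empty bodies and log nothing.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical instrumentation, not part of the classes above:
// every constructor and destructor appends its name to this log.
static std::vector<std::string> calls;

class A {
public:
  A() { calls.push_back("A()"); }
  ~A() { calls.push_back("~A()"); }
};

class B {
public:
  B() { calls.push_back("B()"); }
  ~B() { calls.push_back("~B()"); }
};

class C {
public:
  C() { calls.push_back("C()"); }
  ~C() { calls.push_back("~C()"); }
};

void function1() {
  A a;
  B b;
  C c;
}

// Hypothetical helper: runs function1() once and returns the recorded calls.
std::vector<std::string> traced_function1() {
  calls.clear();
  function1();
  assert(calls.size() == 6);  // three constructors, three destructors
  return calls;
}
```

Calling traced_function1() should give A(), B(), C(), ~C(), ~B(), ~A(), which is the same sequence as the call instructions in the dump above.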

Now let's modify class "C" as below: 

class C {
public:
  C() {
    throw 1;
  }
  ~C() {}
};


C::C() has changed as below. We call a couple of special functions here: __cxa_allocate_exception and __cxa_throw. We'll cover what these functions do later. For now, it's enough to know that they get called when we throw an exception.

(gdb) disassemble C::C()
Dump of assembler code for function C::C():
   0x0804874c <+00>:    push   %ebp
   0x0804874d <+01>:    mov    %esp,%ebp
   0x0804874f <+03>:    sub    $0x18,%esp
   0x08048752 <+06>:    movl   $0x4,(%esp)
   0x08048759 <+13>:    call   0x8048560 <__cxa_allocate_exception@plt>
   0x0804875e <+18>:    movl   $0x1,(%eax)
   0x08048764 <+24>:    movl   $0x0,0x8(%esp)
   0x0804876c <+32>:    movl   $0x804a040,0x4(%esp)
   0x08048774 <+40>:    mov    %eax,(%esp)
   0x08048777 <+43>:    call   0x8048570 <__cxa_throw@plt>
End of assembler dump.

Now, something unexpected has happened here. The code in function1() has changed too, even though we haven't touched a bit in it. Below is the updated function1().

(gdb) disassemble function1
Dump of assembler code for function function1():
   0x08048654 <+00>:    push   %ebp
   0x08048655 <+01>:    mov    %esp,%ebp
   0x08048657 <+03>:    push   %ebx
   0x08048658 <+04>:    sub    $0x24,%esp
   0x0804865b <+07>:    lea    -0xb(%ebp),%eax
   0x0804865e <+10>:    mov    %eax,(%esp)
   0x08048661 <+13>:    call   0x8048734 <A::A()>
   0x08048666 <+18>:    lea    -0xa(%ebp),%eax
   0x08048669 <+21>:    mov    %eax,(%esp)
   0x0804866c <+24>:    call   0x8048740 <B::B()>
   0x08048671 <+29>:    lea    -0x9(%ebp),%eax
   0x08048674 <+32>:    mov    %eax,(%esp)
   0x08048677 <+35>:    call   0x804874c <C::C()>
   0x0804867c <+40>:    lea    -0x9(%ebp),%eax
   0x0804867f <+43>:    mov    %eax,(%esp)
   0x08048682 <+46>:    call   0x804877c <C::~C()>
   0x08048687 <+51>:    lea    -0xa(%ebp),%eax
   0x0804868a <+54>:    mov    %eax,(%esp)
   0x0804868d <+57>:    call   0x8048746 <B::~B()>
   0x08048692 <+62>:    lea    -0xb(%ebp),%eax
   0x08048695 <+65>:    mov    %eax,(%esp)
   0x08048698 <+68>:    call   0x804873a <A::~A()>
   0x0804869d <+73>:    add    $0x24,%esp
   0x080486a0 <+76>:    pop    %ebx
   0x080486a1 <+77>:    pop    %ebp
   0x080486a2 <+78>:    ret   
   0x080486a3 <+79>:    mov    %eax,%ebx
   0x080486a5 <+81>:    lea    -0xa(%ebp),%eax
   0x080486a8 <+84>:    mov    %eax,(%esp)
   0x080486ab <+87>:    call   0x8048746 <B::~B()>
   0x080486b0 <+92>:    lea    -0xb(%ebp),%eax
   0x080486b3 <+95>:    mov    %eax,(%esp)
   0x080486b6 <+98>:    call   0x804873a <A::~A()>
   0x080486bb <+103>:    mov    %ebx,%eax
   0x080486bd <+105>:    mov    %eax,(%esp)
   0x080486c0 <+108>:    call   0x8048590 <_Unwind_Resume@plt>

End of assembler dump.
 

Note that a new block of instructions ( <+79> through <+108> ) has been added after the ret instruction. Looks like some calls to destructors have been added here. Let's analyze what exactly this means.

void function1() {
  A a;
  B b;
  C c;
}

Now, in the above function, when the compiler reads C c;, it sees that C's constructor can throw an exception. So, it does the following:


1. If C::C() throws an exception at run-time, it means that objects a and b have already been constructed successfully. In that case, they SHOULD BE destroyed. So, the compiler inserts calls to B::~B() and A::~A() ( remember the reverse order ).

2. Also, if C::C() throws an exception at run-time, then object c is not considered to be fully constructed. In that case, it need not be destructed, meaning C::~C() need not be called. So, the compiler doesn't bother about C::~C(). Hence, no calls to C::~C() in the above new code.

3. Since we are on red alert, we need to convey this to the caller of function1() ( which is main() here ) too. So, the compiler calls the _Unwind_Resume function to continue the same steps in the parent function. Note that _Unwind_Resume is placed at the very end of the function, right after the cleanup calls, so sequential execution runs into it. This means we're not going through the normal leave / ret code path here. We're using a secondary code path.

This is the secret behind how compilers propagate exceptions and unwind stacks. The compiler analyzes the code while compiling, and adds extra code to handle exceptions. This means extra work and extra compile time. This also implies that your library / executable might get a little bigger than normal.

Okay, we covered the simple case of class C throwing an exception. Let's deal with a little more complex case. What happens when constructors A, B and C, all of them could possibly throw an exception ? This means that all 3 of them have the code which could trigger an exception at run-time. We won't know until run-time, who will throw one. The compiler has to generate code in function1() to accommodate all the cases. The expected behavior is listed below:

1. At run-time, if A's constructor throws an exception, then we should just exit the stack frame, without calling any destructor.

2. If B's constructor throws an exception, then we should only call A's destructor.

3. If C's constructor throws an exception, then we should call the destructors of both A and B.
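The three cases can be checked at run-time with a small sketch. The thrower switch, the trace string and run_case() are hypothetical helpers I've added for the experiment; they're not part of the original classes.

```cpp
#include <cassert>
#include <string>

// Hypothetical instrumentation: 'thrower' selects which constructor throws,
// 'trace' records every constructor and destructor that actually ran.
static char thrower = 0;
static std::string trace;

class A {
public:
  A() { if (thrower == 'A') throw 1; trace += "A "; }
  ~A() { trace += "~A "; }
};

class B {
public:
  B() { if (thrower == 'B') throw 1; trace += "B "; }
  ~B() { trace += "~B "; }
};

class C {
public:
  C() { if (thrower == 'C') throw 1; trace += "C "; }
  ~C() { trace += "~C "; }
};

void function1() {
  A a;
  B b;
  C c;
}

// Runs function1() with the chosen constructor throwing and returns the trace.
std::string run_case(char who) {
  assert(who == 'A' || who == 'B' || who == 'C');
  thrower = who;
  trace.clear();
  try {
    function1();
  } catch (int) {
    trace += "caught";  // reached only after the cleanup code has run
  }
  return trace;
}
```

run_case('A') returns "caught", run_case('B') returns "A ~A caught" and run_case('C') returns "A B ~B ~A caught". The destructor of the object whose constructor threw never shows up in the trace.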

Below is the assembly code generated when all 3 constructors can throw an exception. The initial section of function1() has been trimmed off, since it's the same as before.

(gdb) disassemble function1
Dump of assembler code for function function1():
...

...
   0x080486a0 <+76>:    pop    %ebx
   0x080486a1 <+77>:    pop    %ebp
   0x080486a2 <+78>:    ret   
   0x080486a3 <+79>:    mov    %eax,%ebx
   0x080486a5 <+81>:    lea    -0xa(%ebp),%eax
   0x080486a8 <+84>:    mov    %eax,(%esp)
   0x080486ab <+87>:    call   0x804879e <B::~B()>
   0x080486b0 <+92>:    jmp    0x80486b4 <function1()+96>
   0x080486b2 <+94>:    mov    %eax,%ebx
   0x080486b4 <+96>:    lea    -0xb(%ebp),%eax
   0x080486b7 <+99>:    mov    %eax,(%esp)
   0x080486ba <+102>:    call   0x8048768 <A::~A()>
   0x080486bf <+107>:    mov    %ebx,%eax
   0x080486c1 <+109>:    mov    %eax,(%esp)
   0x080486c4 <+112>:    call   0x8048590 <_Unwind_Resume@plt>
End of assembler dump.


If we look at the code above, we can sort out the answers by plain reasoning. To solve (1), jump from A's constructor to address 0x080486bf ( so that we skip the calls to B::~B() and A::~A(); since nothing needs cleaning up in this case, the frame could just as well be skipped entirely ). For (2), jump from B's constructor to address 0x080486b2 ( so that we run the call to A::~A() and not B::~B() ). This entry point has its own mov %eax,%ebx, and that is exactly the instruction the jmp at 0x080486b0 skips over. For (3), jump from C's constructor to address 0x080486a3 ( so that we run the calls to both B::~B() and A::~A() ).
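The fall-through layout of that cleanup section can be modelled with a plain switch statement. This is only a sketch of the idea, not what the compiler generates: ThrownBy and cleanup_path() are hypothetical names, and each case label stands in for one entry address in the dump.

```cpp
#include <cassert>
#include <string>

// Hypothetical model: which constructor threw decides where we enter
// the cleanup section of function1().
enum class ThrownBy { A, B, C };

// Returns the cleanup steps executed from the chosen entry point onwards.
std::string cleanup_path(ThrownBy who) {
  std::string steps;
  switch (who) {
    case ThrownBy::C:            // entry at <+79>
      steps += "~B ";
      [[fallthrough]];           // plays the role of the jmp at <+92>
    case ThrownBy::B:            // entry at <+94>
      steps += "~A ";
      [[fallthrough]];
    case ThrownBy::A:            // nothing to destroy
      steps += "_Unwind_Resume";
      break;
  }
  assert(!steps.empty());        // every path ends by resuming the unwind
  return steps;
}
```

The later you enter, the fewer destructors you run, and every entry ends in the same _Unwind_Resume call. That is why the destructor calls appear in reverse order of construction in the dump.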

One might quickly guess that these addresses can be hard-coded in the constructors to close this whole issue. But this won't work, since these classes ( and their constructors ) might be used in a lot of places, not only in function1(). So, it becomes obvious that the mechanism for finding the address to jump to should be dynamic. And the code for doing that should be in one of __cxa_allocate_exception or __cxa_throw, since they are the ones being called once an exception is thrown. Decent guess. So, let's explore a bit what each one of them does.

So far, there's one question we haven't answered. When we do a 'throw someObject;' how is someObject accessible in a different stack frame ? The stack frame where the exception originated is already gone .. right ?

Correct. The exception is not allocated on the stack. It is allocated in the freestore. __cxa_allocate_exception is the guy responsible for allocating it. How can you verify it ? Simple. Let's modify the exception thrown by A's constructor a bit. Rather than doing a "throw 1", we'll throw an object which has a big size, and see where it gets allocated.

class Memory {
private:
  int a_[ 1024 * 1024 ];
public:
  Memory() {
    a_[ 1024 ] = 0xaaaabbbb;
  }
};

class A {
public:
  A() {
    throw Memory();
  }
  ~A() {
  }
}; 

Here, we've created a class Memory, whose size is 4 Mbytes ( sizeof(int) * 1024 * 1024, with a 4-byte int ). Now, let's analyze the constructor of A to see where it allocates this object.

Dump of assembler code for function A::A():
   0x0804878a <+00>:    push   %ebp
   0x0804878b <+01>:    mov    %esp,%ebp
   0x0804878d <+03>:    push   %ebx
   0x0804878e <+04>:    sub    $0x14,%esp
   0x08048791 <+07>:    movl   $0x400000,(%esp)
   0x08048798 <+14>:    call   0x80485a0 <__cxa_allocate_exception@plt>
   0x0804879d <+19>:    mov    %eax,%ebx
   0x0804879f <+21>:    mov    %ebx,(%esp)
   0x080487a2 <+24>:    call   0x8048778 <Memory::Memory()>
   0x080487a7 <+29>:    movl   $0x0,0x8(%esp)
   0x080487af <+37>:    movl   $0x8048918,0x4(%esp)
   0x080487b7 <+45>:    mov    %ebx,(%esp)
   0x080487ba <+48>:    call   0x80485b0 <__cxa_throw@plt>
End of assembler dump. 


If the exception Memory() was allocated on the stack, the value of %esp would have been decremented by approximately 4 Mbytes. It's not. %esp is just decremented by $0x14 ( 20 bytes ). So, the Memory() object was not allocated on the stack. If we move our focus to the movl $0x400000,(%esp) instruction at <+07>, things should be obvious. A value of 4 Mbytes ( 0x400000 ) is passed as the argument to the __cxa_allocate_exception function. So, it's this function that allocates the exception dynamically, as the name suggests. Still not convinced ? Try a simple trick. Break in __cxa_allocate_exception in gdb. Analyze the memory usage of the program ( pmap -x <pid> ). Run the 'finish' command in gdb to complete the execution of __cxa_allocate_exception. Now analyze the memory again. It should have increased by 4 Mbytes.

So, __cxa_allocate_exception is responsible for allocation. It returns the allocated address ( in %eax, the return value register ), where the Memory() object is constructed. After this, we call __cxa_throw. 0x8(%esp) is the 3rd argument. 0x4(%esp) is the 2nd argument. (%esp) is the 1st argument. The 3rd argument is a pointer to the exception object's destructor; it is 0 here because Memory has no destructor that needs to run. In short, we call

__cxa_throw( Memory()'s this pointer, Memory()'s typeinfo, 0 )

So, by elimination, if __cxa_allocate_exception only takes care of allocating memory for the exception, then __cxa_throw is the guy who knows the jumping logic. Logically, this is the function where all the tricks should happen. This function knows exactly where to land in the previous stack frame ( function1() in our case ), so that it avoids calling the wrong destructors. At this point, I'm not clear on how it does this either. Will defer it for now.
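The two calls in the disassembly can also be written out by hand. The sketch below assumes a toolchain that follows the Itanium C++ ABI and ships <cxxabi.h> ( GCC or Clang on Linux, for instance ). Payload, throw_by_hand() and catch_value() are hypothetical names; Payload is a small stand-in for Memory.

```cpp
#include <cassert>
#include <cxxabi.h>   // abi::__cxa_allocate_exception, abi::__cxa_throw
#include <new>
#include <typeinfo>

// Hypothetical payload type, a small stand-in for Memory.
struct Payload {
  int value;
};

// Spells out by hand the two runtime calls seen in the disassembly
// for a statement like `throw Payload{v};`.
[[noreturn]] void throw_by_hand(int v) {
  // Step 1: get storage for the exception object from the runtime.
  void* storage = abi::__cxa_allocate_exception(sizeof(Payload));
  // Step 2: construct the object in that storage.
  Payload* exc = new (storage) Payload{v};
  // Step 3: hand over the object, its typeinfo and a destructor pointer
  // (null here, since Payload needs no destructor call).
  abi::__cxa_throw(exc, const_cast<std::type_info*>(&typeid(Payload)), nullptr);
}

// Catches the hand-made exception with an ordinary handler.
int catch_value(int v) {
  int seen = -1;
  try {
    throw_by_hand(v);
  } catch (const Payload& p) {
    seen = p.value;
  }
  assert(seen == v);  // the handler ran and saw the object built in step 2
  return seen;
}
```

An ordinary catch ( const Payload& ) picks up the hand-made exception, which suggests that the throw statement really is just these two calls plus the construction of the object in between.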

2. Catching an exception: 

Catching is pretty trivial. Whenever we catch an exception, we either have the choice to end the misery, or to propagate ( aka re-throwing ) it with a smile on the face. If you're propagating, then refer to the previous section. If you've handled the exception, go to sleep ;-)
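Both choices fit in one small sketch: middle() re-throws and outer() handles. The function names and the journey trace are hypothetical, added only to make the path of the exception visible.

```cpp
#include <cassert>
#include <string>

// Hypothetical trace of where the exception has been.
static std::string journey;

void thrower() {
  throw 1;
}

// Catches, notes it, then propagates the same exception to its caller.
void middle() {
  try {
    thrower();
  } catch (int) {
    journey += "middle:rethrow ";
    throw;  // re-throw: unwinding continues in the caller
  }
}

// Catches and handles: the exception ends here.
std::string outer() {
  journey.clear();
  try {
    middle();
  } catch (int e) {
    journey += "outer:handled " + std::to_string(e);
  }
  assert(journey.find("outer:handled") != std::string::npos);  // handler ran
  return journey;
}
```

outer() returns "middle:rethrow outer:handled 1". The re-throw in middle() sends us back through the unwinding steps from the previous section, and the handler in outer() is where it all ends.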
