.gcc_except_table
Throwing an exception in C++ requires more than unwinding the stack. As the program unwinds, local variable destructors must be executed. Catch clauses must be examined to see if they should catch the exception. Exception specifications must be checked to see if the exception should be redirected to the unexpected handler. Similar issues arise in Go, Java, and even C when using gcc’s cleanup function attribute.
As I described earlier, each CIE in the unwind data may contain a pointer to a personality function, and each FDE may contain a pointer to the LSDA, the Language Specific Data Area. Each language has its own personality function. The LSDA is only used by the personality function, so it could in principle differ for each language. However, at least for gcc, every language uses the same format, since the LSDA is generated by the language-independent middle-end.
The personality function takes five arguments:
- An int version number, currently 1.
- A bitmask of actions.
- An exception class, a 64-bit unsigned integer which is specific to a language.
- A pointer to information about the specific exception being thrown.
- Unwinder state information.
The exception class permits code written in one language to work correctly when an exception is thrown by code written in a different language. The value for g++ is “GNUCC++\0” (or “GNUCC++\1” for a dependent exception, which is used when rethrowing an exception). The value for Go is “GNUCGO\0\0”. The exception specific information can only be examined if the exception class is recognized.
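To make the exception class concrete, here is a small sketch that packs the eight characters into a 64-bit integer, with the first character in the most significant byte. The helper name make_exception_class is made up for illustration, and the byte order is an assumption based on how libstdc++ builds its own constant.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: pack an eight-character name into a 64-bit exception
// class, first character in the most significant byte.
std::uint64_t make_exception_class(const char (&name)[9]) {
    // An eight-byte literal carries a ninth byte, the implicit terminator.
    assert(name[8] == '\0');
    std::uint64_t value = 0;
    for (int i = 0; i < 8; ++i)
        value = (value << 8) | static_cast<std::uint8_t>(name[i]);
    return value;
}

// A personality function should only look inside the exception object when
// the class is one it recognizes.
bool is_gxx_exception(std::uint64_t exception_class) {
    return exception_class == make_exception_class("GNUCC++\0") ||
           exception_class == make_exception_class("GNUCC++\1");
}
```

With this packing, "GNUCC++\0" becomes 0x474E5543432B2B00, and the dependent-exception class differs only in the lowest byte.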
Unwinding the stack for an exception is done in two phases. In the first phase, the unwinder walks up the stack passing the action _UA_SEARCH_PHASE (which has the value 1) to each personality function that it finds. The personality function should examine the LSDA to see if there is a handler for the exception being thrown. It should return _URC_HANDLER_FOUND (6) if there is or _URC_CONTINUE_UNWIND (8) if there isn’t. The search phase will continue until a handler is found or until the top of the stack is reached. The unwinder will not actually change anything while walking. If the top of the stack is reached the unwinder will simply return, and the calling code will take the appropriate action, which for C++ is to call std::terminate. Because of the two phase unwinding approach, if std::terminate dumps core, a backtrace will show the code which threw the exception.
If a handler is found, the second phase begins. The unwinder walks up the stack passing the action _UA_CLEANUP_PHASE (2) to each personality function. The unwinder will also set _UA_FORCE_UNWIND (8) in the actions bitmask if the personality function may not catch the exception, because the unwinding is happening due to some event like thread cancellation. The unwinder will walk up the stack until it finds the handler, that is, the stack frame for which the personality function returned _URC_HANDLER_FOUND. When it calls the personality function for that frame, the unwinder will pass _UA_HANDLER_FRAME (4) in the actions bitmask. This time, the unwinder will change things as it goes, removing stack frames.
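The two phases can be modeled in a few lines. This is a toy simulation, not the real unwinder: Frame, personality, and raise_exception are hypothetical names, the constants are redefined locally with the values quoted above (plus 5 for the end-of-stack code, which is how I recall unwind.h defining _URC_END_OF_STACK), and running a cleanup is reduced to setting a flag.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Local copies of the values quoted in the text, so the model needs no <unwind.h>.
enum Action { SearchPhase = 1, CleanupPhase = 2, HandlerFrame = 4, ForceUnwind = 8 };
enum Reason { EndOfStack = 5, HandlerFound = 6, InstallContext = 7, ContinueUnwind = 8 };

// A toy stack frame: it may have a handler, a cleanup, or neither.
struct Frame {
    bool has_handler;
    bool has_cleanup;
    bool cleaned_up;  // set when phase 2 runs the cleanup
    bool popped;      // set when phase 2 removes the frame
};

// Toy personality function: answers the search phase, acts in the cleanup phase.
Reason personality(int actions, Frame& frame) {
    if (actions & SearchPhase)
        return frame.has_handler ? HandlerFound : ContinueUnwind;
    assert(actions & CleanupPhase);
    if (actions & HandlerFrame)
        return InstallContext;  // control enters the handler
    if (frame.has_cleanup) {
        frame.cleaned_up = true;  // stands in for branching to a landing pad
        return InstallContext;
    }
    return ContinueUnwind;
}

// frames[0] is the innermost frame. Returns EndOfStack if nobody wants the exception.
Reason raise_exception(std::vector<Frame>& frames) {
    // Phase 1: look for a handler without changing anything.
    std::size_t handler = frames.size();
    for (std::size_t i = 0; i < frames.size(); ++i) {
        if (personality(SearchPhase, frames[i]) == HandlerFound) {
            handler = i;
            break;
        }
    }
    if (handler == frames.size())
        return EndOfStack;  // the caller decides what to do; C++ calls std::terminate

    // Phase 2: run cleanups and remove frames up to the handler frame.
    for (std::size_t i = 0; i < handler; ++i) {
        personality(CleanupPhase, frames[i]);
        frames[i].popped = true;
    }
    return personality(CleanupPhase | HandlerFrame, frames[handler]);
}
```

Note that when no handler exists, no frame is touched, which is why a core dump from std::terminate still shows the throw site.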
In order to run destructors, the personality function will call _Unwind_SetIP on the context parameter to set the program counter to point to the cleanup routine, and then return _URC_INSTALL_CONTEXT (7) to tell the unwinder to branch to the current context. The address which starts the cleanup is known as a landing pad. The cleanup should do whatever it needs to do, and then call _Unwind_Resume. The exception information needs to be passed to _Unwind_Resume. The personality routine arranges to pass the exception information to the cleanup by calling _Unwind_SetGR passing __builtin_eh_return_data_regno(0) and the exception information passed to the personality routine. Each target which supports this approach has to dedicate two registers to holding exception information. This is the first one.
The personality function which finds the handler works pretty much the same way. It may also use _Unwind_SetGR to set a value in __builtin_eh_return_data_regno(1) to indicate which exception was found. The exception handler may rethrow the exception via _Unwind_RaiseException or it may simply continue a normal execution path.
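The register setup described above might look like the following helper. This is only a sketch: enter_landing_pad and its landing_pad and switch_value parameters are hypothetical, and it assumes a GCC or Clang toolchain on a target using the Itanium-style unwind.h (for example x86-64 Linux). The static_asserts check the numeric values quoted in this post against the header.

```cpp
#include <cassert>
#include <cstdint>
#include <unwind.h>

// The values quoted in the text, checked against the toolchain's header.
static_assert(_UA_SEARCH_PHASE == 1 && _UA_CLEANUP_PHASE == 2, "phase flags");
static_assert(_UA_HANDLER_FRAME == 4 && _UA_FORCE_UNWIND == 8, "modifier flags");
static_assert(_URC_HANDLER_FOUND == 6 && _URC_INSTALL_CONTEXT == 7 &&
              _URC_CONTINUE_UNWIND == 8, "reason codes");

// Hypothetical helper a personality function could call once it has decided
// to enter a landing pad, either for a cleanup or for a handler.
_Unwind_Reason_Code enter_landing_pad(_Unwind_Context* context,
                                      _Unwind_Exception* exception,
                                      std::uintptr_t landing_pad,
                                      int switch_value) {
    assert(context != nullptr && exception != nullptr);
    // First exception register: the exception information itself.
    _Unwind_SetGR(context, __builtin_eh_return_data_regno(0),
                  reinterpret_cast<std::uintptr_t>(exception));
    // Second exception register: which handler was selected (0 for a cleanup).
    _Unwind_SetGR(context, __builtin_eh_return_data_regno(1),
                  static_cast<std::uintptr_t>(
                      static_cast<std::intptr_t>(switch_value)));
    // Resume execution at the landing pad.
    _Unwind_SetIP(context, landing_pad);
    return _URC_INSTALL_CONTEXT;
}
```

A real personality function would return the result of such a helper directly to the unwinder, which then branches to the landing pad.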
At this point we’ve seen everything except how the personality function decides whether it needs to run a cleanup or catch an exception. The personality function makes this decision based on the LSDA. As mentioned above, while the LSDA could be language dependent, in practice it is not. There is a different personality function for each language, but they all do more or less the same thing, omitting aspects which are not relevant for the language (e.g., there is a personality function for C, but it only runs cleanups and does not bother to look for exception handlers).
The LSDA is found in the section .gcc_except_table (the personality function is just a function and lives in the .text section as usual). The personality function gets a pointer to it by calling _Unwind_GetLanguageSpecificData. The LSDA starts with the following fields:
- A 1 byte encoding of the following field (a DW_EH_PE_xxx value).
- If the encoding is not DW_EH_PE_omit, the landing pad base. This is the base from which landing pad offsets are computed. If this is omitted, the base comes from calling _Unwind_GetRegionStart, which returns the beginning of the code described by the current FDE. In practice this field is normally omitted.
- A 1 byte encoding of the entries in the types table (a DW_EH_PE_xxx value).
- If the encoding is not DW_EH_PE_omit, the types table pointer. This is an unsigned LEB128 value, and is the byte offset from this field to the start of the types table used for exception matching.
- A 1 byte encoding of the fields in the call-site table (a DW_EH_PE_xxx value).
- An unsigned LEB128 value holding the length in bytes of the call-site table.
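The header fields above can be read with a short parser. This is a sketch under some assumptions: LsdaHeader and parse_lsda_header are hypothetical names, only the usual case of an omitted landing pad base is handled, and the types table offset is measured from the byte just past the offset field, which is my reading of how libstdc++ computes it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

const std::uint8_t DW_EH_PE_omit = 0xff;  // "this field is not present"

// Decode one unsigned LEB128 value starting at data[pos]; advances pos.
std::uint64_t read_uleb128(const std::uint8_t* data, std::size_t size,
                           std::size_t& pos) {
    std::uint64_t value = 0;
    unsigned shift = 0;
    std::uint8_t byte;
    do {
        assert(pos < size && shift < 64);
        byte = data[pos++];
        value |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
        shift += 7;
    } while (byte & 0x80);
    return value;
}

// All offsets are byte offsets from the start of the LSDA.
struct LsdaHeader {
    bool ok;                          // false if this sketch cannot parse the LSDA
    std::uint8_t types_encoding;      // DW_EH_PE_xxx for types table entries
    std::size_t types_table;          // types table base, 0 if omitted
    std::uint8_t call_site_encoding;  // DW_EH_PE_xxx for call-site fields
    std::size_t call_site_table;      // first call-site entry
    std::size_t call_site_length;     // size of the call-site table in bytes
};

LsdaHeader parse_lsda_header(const std::uint8_t* lsda, std::size_t size) {
    LsdaHeader h = {};
    std::size_t pos = 0;
    if (size < 4) return h;
    // Landing pad base: only the common "omitted" case is handled here.
    if (lsda[pos++] != DW_EH_PE_omit) return h;
    h.types_encoding = lsda[pos++];
    if (h.types_encoding != DW_EH_PE_omit) {
        std::uint64_t offset = read_uleb128(lsda, size, pos);
        // Assumption: the offset counts from the byte after the field.
        h.types_table = pos + static_cast<std::size_t>(offset);
    }
    assert(pos < size);
    h.call_site_encoding = lsda[pos++];
    h.call_site_length =
        static_cast<std::size_t>(read_uleb128(lsda, size, pos));
    h.call_site_table = pos;
    h.ok = h.call_site_table + h.call_site_length <= size;
    return h;
}
```

The call-site table then occupies call_site_length bytes starting at call_site_table, and the action table follows it.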
This header is immediately followed by the call-site table. The length field at the end of the header gives the table’s total size in bytes. Each entry in the call-site table describes a particular sequence of instructions within the function that the FDE describes, and has four fields:
- The start of the instructions for the current call site, a byte offset from the landing pad base. This is encoded using the encoding from the header.
- The length of the instructions for the current call site, in bytes. This is encoded using the encoding from the header.
- A pointer to the landing pad for this sequence of instructions, or 0 if there isn’t one. This is a byte offset from the landing pad base. This is encoded using the encoding from the header.
- The action to take, an unsigned LEB128. This is 1 plus a byte offset into the action table. The value zero means that there is no action.
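Looking up the entry for a given PC could be sketched as follows. To keep it short, this assumes the header selected unsigned LEB128 as the call-site encoding, so all four fields are decoded the same way; CallSite and find_call_site are hypothetical names, and pc_offset is the PC expressed as a byte offset from the landing pad base.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Decode one unsigned LEB128 value starting at data[pos]; advances pos.
std::uint64_t read_uleb128(const std::uint8_t* data, std::size_t size,
                           std::size_t& pos) {
    std::uint64_t value = 0;
    unsigned shift = 0;
    std::uint8_t byte;
    do {
        assert(pos < size && shift < 64);
        byte = data[pos++];
        value |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
        shift += 7;
    } while (byte & 0x80);
    return value;
}

// One decoded call-site entry; found is false when no entry covers the PC.
struct CallSite {
    bool found;
    std::uint64_t start;        // offset from the landing pad base
    std::uint64_t length;       // in bytes
    std::uint64_t landing_pad;  // offset from the landing pad base, 0 if none
    std::uint64_t action;       // 1 + offset into the action table, 0 if none
};

// Scan the table for the entry whose range [start, start + length) covers
// pc_offset. Assumes every field uses the unsigned LEB128 encoding.
CallSite find_call_site(const std::uint8_t* table, std::size_t table_size,
                        std::uint64_t pc_offset) {
    std::size_t pos = 0;
    while (pos < table_size) {
        CallSite cs = {};
        cs.start = read_uleb128(table, table_size, pos);
        cs.length = read_uleb128(table, table_size, pos);
        cs.landing_pad = read_uleb128(table, table_size, pos);
        cs.action = read_uleb128(table, table_size, pos);
        if (pc_offset < cs.start)
            break;  // entries are sorted by start, so no later entry can match
        if (pc_offset < cs.start + cs.length) {
            cs.found = true;
            return cs;
        }
    }
    return CallSite();  // value-initialized: found == false
}
```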
The call-site table is sorted by the start address field. If the personality function finds that there is no entry for the current PC in the call-site table, then there is no exception information. This should not happen in normal operation, and in C++ will lead to a call to std::terminate. If there is an entry in the call-site table, but the landing pad is zero, then there is nothing to do: there are no destructors to run or exceptions to catch. This is a normal case, and the unwinder will simply continue. If the action record is zero, then there are destructors to run but no exceptions to catch. The personality function will arrange to run the destructors as described above, and unwinding will continue.
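Those cases reduce to a small decision function. The names Outcome and classify_call_site are hypothetical; the inputs are the result of the call-site lookup.

```cpp
#include <cassert>
#include <cstdint>

enum Outcome {
    NoEntry,            // no exception information: C++ ends in std::terminate
    NothingToDo,        // no landing pad: the unwinder simply continues
    CleanupOnly,        // run destructors, then keep unwinding
    ConsultActionTable  // walk the action records to look for a handler
};

// Classify a call-site lookup. When the result is ConsultActionTable,
// *action_offset receives the byte offset into the action table.
Outcome classify_call_site(bool entry_found, std::uint64_t landing_pad,
                           std::uint64_t action, std::uint64_t* action_offset) {
    assert(action_offset != nullptr);
    *action_offset = 0;
    if (!entry_found) return NoEntry;
    if (landing_pad == 0) return NothingToDo;
    if (action == 0) return CleanupOnly;
    *action_offset = action - 1;  // the field stores 1 plus the byte offset
    return ConsultActionTable;
}
```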
Otherwise, we have an offset into the action table. Each entry in the action table is a pair of signed LEB128 values. The first number is a type filter. The second number is a byte offset to the next entry in the action table. A byte offset of 0 ends the current set of actions.
A type filter of zero indicates a cleanup, which is the same as an action record of zero in the call-site table. This means that there is a cleanup to be called even if none of the types match.
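Walking a chain of action records could look like this. The function name collect_filters is hypothetical, and one detail is an assumption on my part: the next-entry offset is taken relative to the position of the offset field itself, which matches my reading of the libstdc++ personality routine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Decode one signed LEB128 value starting at data[pos]; advances pos.
std::int64_t read_sleb128(const std::uint8_t* data, std::size_t size,
                          std::size_t& pos) {
    std::uint64_t value = 0;
    unsigned shift = 0;
    std::uint8_t byte;
    do {
        assert(pos < size && shift < 64);
        byte = data[pos++];
        value |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
        shift += 7;
    } while (byte & 0x80);
    // Sign-extend when bit 6 of the final byte is set.
    if (shift < 64 && (byte & 0x40))
        value |= ~static_cast<std::uint64_t>(0) << shift;
    return static_cast<std::int64_t>(value);
}

// Collect the type filters reachable from a call-site action value
// (1 plus a byte offset into the action table, or 0 for no action).
std::vector<std::int64_t> collect_filters(const std::uint8_t* actions,
                                          std::size_t size,
                                          std::uint64_t call_site_action) {
    std::vector<std::int64_t> filters;
    if (call_site_action == 0) return filters;
    std::size_t pos = static_cast<std::size_t>(call_site_action - 1);
    while (true) {
        std::int64_t filter = read_sleb128(actions, size, pos);
        std::size_t offset_field = pos;  // where the offset itself is stored
        std::int64_t next = read_sleb128(actions, size, pos);
        filters.push_back(filter);
        if (next == 0) break;  // end of this set of actions
        // Assumption: the offset is relative to the offset field's position.
        pos = static_cast<std::size_t>(
            static_cast<std::int64_t>(offset_field) + next);
    }
    return filters;
}
```

For the four-byte table 02 00 01 7d, a call-site action value of 3 starts at the record with filter 1, whose offset of -3 leads back to the record with filter 2, which ends the chain.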
A positive type filter is an index into the types table. The index counts backward from the types table base: the value 1 means the entry preceding the types table base, 2 means the entry before that, etc. The size of entries in the types table comes from the encoding in the header, as does the base of the types table. Each entry in the types table is a pointer to a type information structure. If this type information structure matches the type of the exception, then we have found a handler for this exception. The type filter value is a switch value that will be passed to the handler in exception register 1. The actual comparison of the type information, and determining the type information from the exception pointer, really is language dependent. In C++ each entry is a pointer to a std::type_info structure. A NULL pointer in the types table is a catch-all handler.
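The backward indexing can be shown with a small helper. This sketch assumes fixed-size 4-byte entries (as with a udata4-style encoding) and returns the raw stored value rather than resolving it to a pointer, since that step depends on the encoding; types_table_entry is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Return the raw types table entry selected by a positive type filter.
// types_base is the byte offset of the types table base within the LSDA;
// entries sit before the base, so filter 1 is the entry just below it.
std::uint32_t types_table_entry(const std::uint8_t* lsda,
                                std::size_t types_base, std::int64_t filter) {
    const std::size_t entry_size = 4;  // assumption: 4-byte entries
    assert(filter > 0);
    std::size_t back = static_cast<std::size_t>(filter) * entry_size;
    assert(back <= types_base);
    std::uint32_t entry;
    std::memcpy(&entry, lsda + types_base - back, entry_size);
    return entry;
}
```

A stored value of zero corresponds to the NULL entry, which the C++ personality function treats as a catch-all.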
A negative type filter is a byte offset into the types table of a NULL-terminated list of pointers to type information structures. If the type of the current exception does not match any of the entries in the list, then there is an exception specification error. This is treated as an exception handler with a negative switch value.
I think that covers everything about how gcc unwinds the stack and throws exceptions.