libdwarf
A Consumer Library Interface to DWARF
Author
David Anderson
Date
2022-11-22 v0.5.1

Suggestions for improvement are welcome.

Your thoughts on the document?

A) Are the section and subsection titles on Main Page meaningful to you?

B) Are the titles on the Modules page meaningful to you?

Anything else you find misleading or confusing? Send suggestions to ( libdwarf-list (at) prevanders with final characters .org ). Sorry about the simple obfuscation to keep bots away. It's actually a simple email address, not a list.

Thanks in advance for any suggestions.

Introduction

This document describes an interface to libdwarf, a library of functions to provide access to DWARF debugging information records, DWARF line number information, DWARF address range and global names information, weak names information, DWARF frame description information, DWARF static function names, DWARF static variables, and DWARF type information. In addition the library provides access to several object sections (created by compiler writers and for debuggers) related to debugging but not mentioned in any DWARF standard.

The document has long mentioned the "Unix International Programming Languages Special Interest Group" (PLSIG), under whose auspices the DWARF committee was formed around 1991. "Unix International" was disbanded in the 1990s and no longer exists.

The DWARF committee published DWARF2 July 27, 1993, DWARF3 in 2005, DWARF4 in 2010, and DWARF5 in 2017.

In the mid 1990s this document and the library it describes were made available on the internet by Silicon Graphics, Inc. The committee never endorsed the library, having decided not to endorse or approve any particular library interface.

In 2005 the DWARF committee began an affiliation with FreeStandards.org. In 2007 FreeStandards.org merged with The Linux Foundation. The DWARF committee dropped its affiliation with FreeStandards.org in 2007 and established the dwarfstd.org website.

See also
http://www.dwarfstd.org for current information on standardization activities and a copy of the standard.

Thread Safety

Libdwarf can safely open multiple Dwarf_Debug pointers simultaneously, but all such Dwarf_Debug pointers must be opened within the same thread, and all libdwarf calls must be made from within that same thread.

Error Handling in libdwarf

Essentially every libdwarf call could involve dealing with an error (possibly data corruption in the object file). Here we explain the two main approaches the library provides (though we think only one of them is truly appropriate except in toy programs).

A) The recommended approach is to define a Dwarf_Error and initialize it to 0.

Dwarf_Error error = 0;
struct Dwarf_Error_s * Dwarf_Error
Definition: libdwarf.h:531

Then, in every call where there is a Dwarf_Error argument pass its address. For example:

Dwarf_Half tag = 0;
int res = dwarf_tag(die,&tag,&error);
int dwarf_tag(Dwarf_Die dw_die, Dwarf_Half *dw_return_tag, Dwarf_Error *dw_error)
Get TAG value of DIE.

The possible return values to res are, in general:

DW_DLV_OK
DW_DLV_NO_ENTRY
DW_DLV_ERROR

If DW_DLV_ERROR is returned then error is set (by the library) to a pointer to important details about the error. If DW_DLV_NO_ENTRY or DW_DLV_OK is returned the error argument is ignored by the library.

Some functions cannot possibly return some of these three values, as noted later for each function.

B) An alternative (not recommended) approach is to pass NULL to the error argument.

Dwarf_Half tag = 0;
int res = dwarf_tag(die,&tag,NULL);

If your initialization provided an 'errhand' function pointer argument (see below) the library will call errhand if an error is encountered. (Your errhand function could simply exit if you so choose.)

The library will then return DW_DLV_ERROR, though you will have no way to identify what the error was. It could be a malloc failure, data corruption, an invalid argument to the call, or something else.

That is the whole picture. The library never calls exit() under any circumstances.

Error Handling at initialization

Each initialization call (for example)

int dwarf_init_path(const char *dw_path, char *dw_true_path_out_buffer, unsigned int dw_true_path_bufferlen, unsigned int dw_groupnumber, Dwarf_Handler dw_errhand, Dwarf_Ptr dw_errarg, Dwarf_Debug *dw_dbg, Dwarf_Error *dw_error)
Initialization based on path, the most common initialization.

has two arguments that appear nowhere else in the library.

Dwarf_Handler dw_errhand
Dwarf_Ptr dw_errarg
void(* Dwarf_Handler)(Dwarf_Error dw_error, Dwarf_Ptr dw_errarg)
Definition: libdwarf.h:631
void * Dwarf_Ptr
Definition: libdwarf.h:221

If you use the suggested A) approach just pass NULL to both those arguments.

Note that dw_errarg is a pointer so one could create a struct with data of interest and use that pointer as the dw_errarg. Or one could put an integer in there or simply NULL, it just depends what you want to do in the Dwarf_Handler function you write.

If you wish to provide a dw_errhand, define a function (this first example is not a good choice as it terminates the application!).

void bad_dw_errhandler(Dwarf_Error error,Dwarf_Ptr ptr)
{
    printf("ERROR Exit on %lx due to error 0x%lx %s\n",
        (unsigned long)ptr,
        (unsigned long)dwarf_errno(error),
        dwarf_errmsg(error));
    exit(1);
}
char * dwarf_errmsg(Dwarf_Error dw_error)
What message string is in the error?
Dwarf_Unsigned dwarf_errno(Dwarf_Error dw_error)
What DW_DLE code does the error have?

and pass bad_dw_errhandler (as a function pointer, no parentheses) as the dw_errhand argument. The Dwarf_Ptr argument is the value you passed in as dw_errarg, and can be anything. By doing an exit() you guarantee that your application abruptly stops. This is only acceptable in toy or practice programs.

A better dw_errhand function is

void my_dw_errhandler(Dwarf_Error error,Dwarf_Ptr ptr)
{
    /* Clearly one could write to a log file or do
       whatever the application finds useful. */
    printf("ERROR on %lx due to error 0x%lx %s\n",
        (unsigned long)ptr,
        (unsigned long)dwarf_errno(error),
        dwarf_errmsg(error));
}

because it returns. The DW_DLV_ERROR code is returned from libdwarf and your code can do what it likes with the error situation.

Dwarf_Ptr x = address of some struct I want in the errhandler;
res = dwarf_init_path(...,my_dw_errhandler,x,... );
if (res == ...)

If you do not wish to provide a dw_errhand, just pass both arguments as NULL.

Error Handling Everywhere

So let us examine a case where anything could happen. Here we take the recommended method of using a non-null Dwarf_Error*:

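int func(Dwarf_Debug dbg,Dwarf_Die die, Dwarf_Error* error) {
    Dwarf_Die newdie = 0;
    int res = 0;

    /* Third argument is dw_is_info: non-zero means .debug_info. */
    res = dwarf_siblingof_b(dbg,die,1,&newdie,error);
    if (res != DW_DLV_OK) {
        /* DW_DLV_NO_ENTRY or DW_DLV_ERROR: let the caller decide. */
        return res;
    }
    /* Do something with newdie. */
    dwarf_dealloc_die(newdie);
    newdie = 0;
    return DW_DLV_OK;
}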
struct Dwarf_Die_s * Dwarf_Die
Definition: libdwarf.h:542
void dwarf_dealloc_die(Dwarf_Die dw_die)
Deallocate (free) a DIE.
int dwarf_siblingof_b(Dwarf_Debug dw_dbg, Dwarf_Die dw_die, Dwarf_Bool dw_is_info, Dwarf_Die *dw_return_siblingdie, Dwarf_Error *dw_error)
Return the first DIE or the next sibling DIE.

If res == DW_DLV_OK, then newdie is a DIE pointer and when appropriate we should do dwarf_dealloc_die(newdie).

If res == DW_DLV_NO_ENTRY, then newdie is not set and there is no error. In this case it means die was the last of a sibling list. The exact meaning of course depends on the call.

If res == DW_DLV_ERROR then something really bad happened. The only way to know what is to examine the *error as in

int ev = dwarf_errno(*error);
or
char * msg = dwarf_errmsg(*error);
or both and report that somehow.

In a decently large program you want to free any local memory and return res. In a small and unimportant program, print something and exit.

If you want to discard the error report from the dwarf_siblingof_b() call then possibly do

dwarf_dealloc_error(dbg,*error);
*error = 0;
return DW_DLV_OK;
void dwarf_dealloc_error(Dwarf_Debug dw_dbg, Dwarf_Error dw_error)
Free (dealloc) a Dwarf_Error that was created.

Except in a special case involving function dwarf_set_de_alloc_flag() (which is not usually called), any dwarf_dealloc() that is needed will happen automatically when you call dwarf_finish(). Very long-running programs that make all the appropriate dwarf_dealloc calls themselves should consider calling dwarf_set_de_alloc_flag() to avoid memory bloat.

That's all there is to it.

Extracting Data Per Compilation Unit

The library is designed to run a single pass through the set of Compilation Units (CUs), via a sequence of calls to dwarf_next_cu_header_d() and within a CU initiate a DIE scan using dwarf_siblingof_b(). There is currently no provision for a second pass.

The general plan:

create your local data structure(s)
A. Check your local data structures to see if
you have what you need
B. If sufficient data present act on it,
ensuring your data structures are kept for
further use.
C. Otherwise Read a CU, recording relevant data
in your structures and loop back to A.

For an example

See also
Example walking CUs

Write your code to record the information relevant to you from each CU as you go, so your code has no need for a second pass through the CUs. This is much faster than allowing multiple passes would be.
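The general plan (steps A, B and C) can be sketched without libdwarf at all. In the sketch below everything is hypothetical: struct cu_fact, struct cu_cache, read_next_cu() and find_cu() are illustrative names, and read_next_cu() copies from a simulated array where a real program would call dwarf_next_cu_header_d() and then dwarf_siblingof_b(). It only shows the shape of a lazy single pass that never rereads a CU.

```c
#include <stddef.h>
#include <string.h>

#define MAX_FACTS 8

/* Hypothetical per-CU data worth remembering: a name and an offset. */
struct cu_fact {
    const char *name;
    unsigned long offset;
};

/* Local data structure kept across lookups (steps A and B). */
struct cu_cache {
    struct cu_fact facts[MAX_FACTS];
    size_t count;                   /* facts recorded so far */
    const struct cu_fact *source;   /* simulated object file */
    size_t source_count;
    size_t next;                    /* index of the next unread CU */
};

/* Stand-in for reading one CU and recording what matters.
   Returns 1 if a CU was read, 0 if none remain or the cache is full. */
static int read_next_cu(struct cu_cache *c)
{
    if (c->next >= c->source_count || c->count >= MAX_FACTS) {
        return 0;
    }
    c->facts[c->count++] = c->source[c->next++];
    return 1;
}

/* Returns the recorded fact for name, reading further CUs only
   when the cache does not yet hold it. */
static const struct cu_fact *find_cu(struct cu_cache *c, const char *name)
{
    size_t checked = 0;

    for (;;) {
        /* A. Check the local data for what we need. */
        for (; checked < c->count; ++checked) {
            if (strcmp(c->facts[checked].name, name) == 0) {
                /* B. Found: act on it; the cache stays for later. */
                return &c->facts[checked];
            }
        }
        /* C. Otherwise read one more CU and loop back to A. */
        if (!read_next_cu(c)) {
            return NULL;
        }
    }
}
```

A second lookup for a name already seen is answered from the cache without touching the (simulated) object again, which is the point of recording data as you go.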

Line Table Registers

Please refer to the DWARF5 Standard for details. The line table registers are named in Section 6.2.2 State Machine Registers and are not much changed from DWARF2.

Certain functions on Dwarf_Line data return values for these 'registers' as these are the data available for debuggers and other tools to relate code addresses to source file locations.

address
op_index
file
line
column
is_stmt
basic_block
end_sequence
prologue_end
epilogue_begin
isa
discriminator

Reading Special Sections Independently

DWARF defines (in each version of DWARF) sections which have a somewhat special character.
These are referenced from compilation units and other places and the Standard does not forbid blocks of random bytes at the start or end or between the areas referenced from elsewhere.

Sometimes compilers (or linkers) leave trash behind as a result of optimizations. If there is a lot of space wasted that way it is a quality-of-implementation issue. But usually the wasted space, if any, is small.

Compiler writers or others may be interested in looking at these sections independently so libdwarf provides functions that allow reading the sections without reference to what references them.

Abbreviations can be read independently

Strings can be read independently

String Offsets can be read independently

Those functions allow starting at byte 0 of the section and provide a length so you can calculate the next section offset to call or refer to.

Usually that works fine. But if there is some random data somewhere outside of referenced areas the reader function may fail, returning DW_DLV_ERROR. Such an error is neither a compiler bug nor a libdwarf bug.

Special Frame Registers

In dealing with .debug_frame or .eh_frame there are a few related values that must be set unless one has relatively few registers in the target ABI (anything under 188 registers, see dwarf.h DW_FRAME_LAST_REG_NUM for this default).

The requirements stem from the design of the section. See the DWARF5 Standard for details.

Keep in mind that register values correspond to columns in the theoretical fully complete table of a row per pc and a column per register.

There is no time or space penalty in setting Undefined_Value, Same_Value, and CFA_Column much larger than the Table_Size.

Here are the five values.

Table_Size: This sets the number of columns in the theoretical table. It starts at DW_FRAME_LAST_REG_NUM which defaults to 188. This is the only value you might need to change, since the defaults of the others are reasonably large.

Undefined_Value: A register number that means the register value is undefined, for example due to a call clobbering the register. DW_FRAME_UNDEFINED_VAL defaults to 12288. There is no such column in the table.

Same_Value: A register number that means the register value is the same as the value at the call. Nothing can have clobbered it. DW_FRAME_SAME_VAL defaults to 12289. There is no such column in the table.

Initial_Value: The value must be either DW_FRAME_UNDEFINED_VAL or DW_FRAME_SAME_VAL to represent how most registers are to be thought of at a function call. This is a property of the ABI and instruction set. Specific frame instructions in the CIE or FDE will override this for registers not matching this value.

CFA_Column: A number for the CFA. Defined so we can use a register number to refer to it. DW_FRAME_CFA_COL defaults to 12290. There is no such column in the table. See libdwarf.h struct member rt3_cfa_rule or function dwarf_get_fde_info_for_cfa_reg3_b .

A set of functions allows these to be changed at runtime. The set should be called (if needed) immediately after initializing a Dwarf_Debug and before any other calls on that Dwarf_Debug. If just one value (for example, Table_Size) needs altering, then just call that single function.

For the library accessing frame data to work properly there are certain invariants that must be true once the set of functions have been called.

REQUIRED:

Table_Size > the number of registers in the ABI.
Undefined_Value != Same_Value
CFA_Column != Undefined_Value
CFA_Column != Same_Value
Initial_Value == Same_Value ||
(Initial_Value == Undefined_Value)
Undefined_Value > Table_Size
Same_Value > Table_Size
CFA_Column > Table_Size
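The REQUIRED list translates directly into a check one could run on a set of candidate values before handing them to the library. This is a minimal sketch: struct frame_settings and frame_settings_ok() are hypothetical names, not part of libdwarf, and the check only mirrors the invariants as listed above.

```c
#include <stdbool.h>

/* Hypothetical holder for the five values described above;
   this struct is not part of libdwarf. */
struct frame_settings {
    unsigned table_size;
    unsigned undefined_value;
    unsigned same_value;
    unsigned initial_value;
    unsigned cfa_column;
};

/* Returns true when every REQUIRED invariant holds for an ABI
   with abi_register_count registers. */
static bool frame_settings_ok(const struct frame_settings *s,
    unsigned abi_register_count)
{
    return s->table_size > abi_register_count &&
        s->undefined_value != s->same_value &&
        s->cfa_column != s->undefined_value &&
        s->cfa_column != s->same_value &&
        (s->initial_value == s->same_value ||
         s->initial_value == s->undefined_value) &&
        s->undefined_value > s->table_size &&
        s->same_value > s->table_size &&
        s->cfa_column > s->table_size;
}
```

With the defaults quoted above (188, 12288, 12289, 12290) and an Initial_Value equal to either special value, the check passes for any ABI with fewer than 188 registers.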

.debug_pubnames etc DWARF2-DWARF4

Each section consists of a header for a specific compilation unit (CU) followed by a set of tuples, each tuple consisting of an offset within that compilation unit followed by a null-terminated name string. The tuple set is ended by a 0,0 pair. The data for the next CU follows, and so on.
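To make the tuple layout concrete, here is a small sketch that walks the tuples of one CU's set in a raw byte buffer. It does not use libdwarf, and it rests on assumptions not stated above: offsets are 4 bytes and little-endian (32-bit DWARF), the buffer starts just past the per-CU header, and a zero offset marks the end of the set. walk_tuples() and count_tuple() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Callback invoked once per (offset, name) tuple. */
typedef void (*tuple_fn)(uint32_t offset, const char *name, void *arg);

/* Walks one CU's tuples starting just past its header.
   Assumes 4-byte little-endian offsets. Returns the number of
   bytes consumed, terminator included, or 0 if data is truncated. */
static size_t walk_tuples(const unsigned char *p, size_t len,
    tuple_fn fn, void *arg)
{
    size_t pos = 0;

    for (;;) {
        uint32_t off = 0;
        const unsigned char *nul = NULL;

        if (len - pos < 4) {
            return 0;                /* no room for an offset */
        }
        off = (uint32_t)p[pos] |
            ((uint32_t)p[pos + 1] << 8) |
            ((uint32_t)p[pos + 2] << 16) |
            ((uint32_t)p[pos + 3] << 24);
        pos += 4;
        if (off == 0) {
            return pos;              /* end of this CU's tuples */
        }
        nul = memchr(p + pos, 0, len - pos);
        if (nul == NULL) {
            return 0;                /* name is not terminated */
        }
        if (fn != NULL) {
            fn(off, (const char *)(p + pos), arg);
        }
        pos = (size_t)(nul - p) + 1; /* step past the terminator */
    }
}

/* Example callback: counts tuples through the arg pointer. */
static void count_tuple(uint32_t offset, const char *name, void *arg)
{
    (void)offset;
    (void)name;
    ++*(size_t *)arg;
}
```

In real code the libdwarf functions listed below do this work, including the header parsing and bounds checks on corrupted data; the sketch only illustrates why the data can be viewed either literally or as a flat array of names.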

The function set provided for each such section allows one to print all the section data as it literally appears in the section (with headers and tuples) or to treat it as a single array with CU data columns.

Each has a set of 6 functions.

Section           typename      Standard       main function
.debug_pubnames   Dwarf_Global  DWARF2-DWARF4  dwarf_get_globals
.debug_pubtypes   Dwarf_Type    DWARF3,DWARF4  dwarf_get_pubtypes
struct Dwarf_Global_s * Dwarf_Global
Definition: libdwarf.h:559
struct Dwarf_Type_s * Dwarf_Type
Definition: libdwarf.h:566
int dwarf_get_pubtypes(Dwarf_Debug dw_dbg, Dwarf_Type **dw_types, Dwarf_Signed *dw_number_of_types, Dwarf_Error *dw_error)
Access to DWARF3, DWARF4 .debug_pubtypes section.
int dwarf_get_globals(Dwarf_Debug dw_dbg, Dwarf_Global **dw_globals, Dwarf_Signed *dw_number_of_globals, Dwarf_Error *dw_error)
Global name space operations, .debug_pubnames access.

The following four were defined in SGI/IRIX compilers in the 1990s but never part of the DWARF standard.

It is not likely you will encounter these.

.debug_funcs      Dwarf_Func    None           dwarf_get_funcs
.debug_typenames  Dwarf_Type    None           dwarf_get_types
.debug_vars       Dwarf_Var     None           dwarf_get_vars
.debug_weaks      Dwarf_Weak    None           dwarf_get_weaks
int dwarf_get_weaks(Dwarf_Debug dw_dbg, Dwarf_Weak **dw_weaks, Dwarf_Signed *dw_number_of_weaks, Dwarf_Error *dw_error)
Access to SGI/IRIX .debug_weaks section.
int dwarf_get_funcs(Dwarf_Debug dw_dbg, Dwarf_Func **dw_funcs, Dwarf_Signed *dw_number_of_funcs, Dwarf_Error *dw_error)
Access to SGI/IRIX .debug_funcs section. Static function names and offsets.
int dwarf_get_types(Dwarf_Debug dw_dbg, Dwarf_Type **dw_types, Dwarf_Signed *dw_number_of_types, Dwarf_Error *dw_error)
Access to SGI/IRIX .debug_typenames section. Static type names and offsets.
int dwarf_get_vars(Dwarf_Debug dw_dbg, Dwarf_Var **dw_vars, Dwarf_Signed *dw_number_of_vars, Dwarf_Error *dw_error)
Access to SGI/IRIX .debug_vars section. File-scope static variable names.

Reading DWARF with no object file present

This most commonly happens with just-in-time compilation, when someone working on the code wants to debug this on-the-fly code in a situation where nothing can be written to disk, but DWARF can be constructed in memory.

For a simple example of this

See also
Jitreader Demonstrating DWARF without a file.

But the libdwarf feature can be used in a wide variety of ways.

For example, the DWARF data could be kept in simple files of bytes on the internet. Or on the local net. Or if files can be written locally each section could be kept in a simple stream of bytes in the local file system.

Another example is a non-standard file system, or file format, with the intent of obfuscating the file or the DWARF.

For this to work the code generator must generate standard DWARF.

Overall the idea is a simple one: You write a small handful of functions and supply function pointers and code implementing the functions. These are part of your application or library, not part of libdwarf.

You set up a little bit of data with that code (all described below) and then you have essentially written the dwarf_init_path equivalent and you can access compilation units, line tables etc and the standard libdwarf function calls simply work.

Data you need to create involves these types. What follows describes how to fill them in and how to make them work for you.

struct Dwarf_Obj_Access_Interface_a_s {
    void* ai_object;
    const Dwarf_Obj_Access_Methods_a *ai_methods;
};
struct Dwarf_Obj_Access_Methods_a_s {
    int (*om_get_section_info)(void* obj,
        Dwarf_Half section_index,
        Dwarf_Obj_Access_Section_a* return_section,
        int* error);
    Dwarf_Small (*om_get_byte_order)(void* obj);
    Dwarf_Small (*om_get_length_size)(void* obj);
    Dwarf_Small (*om_get_pointer_size)(void* obj);
    Dwarf_Unsigned (*om_get_filesize)(void* obj);
    Dwarf_Unsigned (*om_get_section_count)(void* obj);
    int (*om_load_section)(void* obj,
        Dwarf_Half section_index,
        Dwarf_Small** return_data, int* error);
    int (*om_relocate_a_section)(void* obj,
        Dwarf_Half section_index,
        int* error);
};
struct Dwarf_Obj_Access_Section_a_s {
    const char* as_name;
    Dwarf_Unsigned as_type;
    Dwarf_Unsigned as_flags;
    Dwarf_Addr as_addr;
    Dwarf_Unsigned as_offset;
    Dwarf_Unsigned as_size;
    Dwarf_Unsigned as_link;
    Dwarf_Unsigned as_info;
    Dwarf_Unsigned as_addralign;
    Dwarf_Unsigned as_entrysize;
};
struct Dwarf_Debug_s * Dwarf_Debug
Definition: libdwarf.h:537
unsigned char Dwarf_Small
Definition: libdwarf.h:217
unsigned short Dwarf_Half
Definition: libdwarf.h:216
unsigned long long Dwarf_Unsigned
Definition: libdwarf.h:209
unsigned long long Dwarf_Addr
Definition: libdwarf.h:212

Dwarf_Obj_Access_Section_a: Your implementation of a om_get_section_info must simply fill in a few fields (leaving most zero) for libdwarf. The fields here are standard Elf, but for most you can just use the value zero. We assume here you will not be doing relocations at runtime.

as_name: Here you set a section name via the pointer. The section names must be names as defined in the DWARF standard, so if such do not appear in your data you have to create the strings yourself.

as_type: Just fill in zero.

as_flags: Just fill in zero.

as_addr: Fill in the address, in local memory, where the bytes of the section are.

as_offset: Just fill in zero.

as_size: Fill in the size, in bytes, of the section you are telling libdwarf about.

as_link: Just fill in zero.

as_info: Just fill in zero.

as_addralign: Just fill in zero.

as_entrysize: Just fill in one.

Dwarf_Obj_Access_Methods_a_s: The functions we need to access object data from libdwarf are declared here.

In these function pointer declarations 'void *obj' is intended to be a pointer (the ai_object field in struct Dwarf_Obj_Access_Interface_a_s) that hides the library-specific and object-specific data that makes it possible to handle multiple object formats and multiple libraries. It's not required that one handles multiple such in a single libdwarf archive/shared-library (but not ruled out either). See dwarf_elf_object_access_internals_t and dwarf_elf_access.c for an example.

Usually the struct Dwarf_Obj_Access_Methods_a_s is statically defined and the function pointers are set at compile time.

The om_get_filesize member was added September 4, 2021. Its position is NOT at the end of the list. All member names now have the om_ prefix.

Section Groups. Debug Fission. COMDAT groups

A typical executable or shared object is unlikely to have any section groups, and in that case what follows is irrelevant and unimportant.

COMDAT groups enable compilers and linkers to work together to eliminate blocks of duplicate DWARF and duplicate CODE.

Debug Fission allows compilers and linkers to separate large amounts of DWARF from the executable, shrinking disk space needed in the executable while allowing full debugging (which also applies to shared objects).

See the DWARF5 Standard, Section E.1 Using Compilation Units page 364.

To name such groups (defined later here) we add the following defines to libdwarf.h (the standard does not specify how to do any of this).

/* These support opening DWARF5 split dwarf objects and
Elf SHT_GROUP blocks of DWARF sections. */
#define DW_GROUPNUMBER_ANY 0
#define DW_GROUPNUMBER_BASE 1
#define DW_GROUPNUMBER_DWO 2

The DW_GROUPNUMBER_ values are used in libdwarf functions dwarf_init_path(), dwarf_init_path_dl() and dwarf_init_b(). In all those cases, unless you know there is some complexity in your object file, pass in DW_GROUPNUMBER_ANY.

To see section groups usage, see the example source:

See also
A simple report on section groups.
Examining Section Group data

The function interface declarations:

See also
dwarf_sec_group_sizes
dwarf_sec_group_map

If an object file has multiple groups libdwarf will not reveal contents of the other groups. One must pass in another groupnumber to dwarf_init_path, meaning init a new Dwarf_Debug, to get libdwarf to access that group.

When opening a Dwarf_Debug the following applies:

If DW_GROUPNUMBER_ANY is passed in libdwarf will choose either of DW_GROUPNUMBER_BASE (1) or DW_GROUPNUMBER_DWO (2) depending on the object content. If both groups one and two are in the object libdwarf will choose DW_GROUPNUMBER_BASE.

If DW_GROUPNUMBER_BASE is passed in libdwarf will choose it if non-split DWARF is in the object, else the init call will return DW_DLV_NO_ENTRY.

If DW_GROUPNUMBER_DWO is passed in libdwarf will choose it if .dwo sections are in the object, else the init call will return DW_DLV_NO_ENTRY.

If a groupnumber greater than two is passed in libdwarf simply accepts it, whether any sections corresponding to that groupnumber exist or not.
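The four rules above can be restated as a small decision function. This is only a model of the documented behavior, not libdwarf's code: model_group_choice() and MODEL_NO_ENTRY are hypothetical names, the has_base and has_dwo flags stand in for what the library discovers in the object, and the result for DW_GROUPNUMBER_ANY on an object with neither kind of DWARF is an assumption, since the text above does not cover that case.

```c
/* Values copied from the libdwarf.h excerpt shown earlier. */
#define DW_GROUPNUMBER_ANY  0
#define DW_GROUPNUMBER_BASE 1
#define DW_GROUPNUMBER_DWO  2

/* Hypothetical marker meaning "the init call returns DW_DLV_NO_ENTRY". */
#define MODEL_NO_ENTRY (-1)

/* Models which group a new Dwarf_Debug would present.
   has_base: object contains non-split DWARF.
   has_dwo:  object contains .dwo sections. */
static int model_group_choice(unsigned requested, int has_base, int has_dwo)
{
    switch (requested) {
    case DW_GROUPNUMBER_ANY:
        if (has_base) {
            return DW_GROUPNUMBER_BASE;  /* base wins if both exist */
        }
        if (has_dwo) {
            return DW_GROUPNUMBER_DWO;
        }
        return MODEL_NO_ENTRY;           /* assumption: no DWARF at all */
    case DW_GROUPNUMBER_BASE:
        return has_base ? DW_GROUPNUMBER_BASE : MODEL_NO_ENTRY;
    case DW_GROUPNUMBER_DWO:
        return has_dwo ? DW_GROUPNUMBER_DWO : MODEL_NO_ENTRY;
    default:
        /* Greater than two: accepted whether or not sections exist. */
        return (int)requested;
    }
}
```

The practical consequence is that an object holding both base and .dwo sections needs two Dwarf_Debug instances, one opened with each explicit group number, to see everything.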

For information on groups "dwarfdump -i" on an object file will show all section group information unless the object file is a simple standard object with no .dwo sections and no COMDAT groups (in which case the output will be silent on groups). Look for Section Groups data in the dwarfdump output. The groups information will be appearing very early in the dwarfdump output.

Sections that are part of an Elf COMDAT GROUP are assigned a group number > 2. There can be many such COMDAT groups in an object file (but none in an executable or shared object). Each such COMDAT group will have a small set of sections in it and each section in such a group will be assigned the same group number by libdwarf.

Sections that are in a .dwp or .dwo object file are assigned to DW_GROUPNUMBER_DWO.

Sections not part of a .dwp package file, a .dwo section, or a COMDAT group are assigned DW_GROUPNUMBER_BASE.

At least one compiler relies on relocations to identify COMDAT groups, but the compiler authors do not publicly document how this works so we ignore such (these COMDAT groups will result in libdwarf returning DW_DLV_ERROR).

Popular compilers and tools are using such sections. There is no detailed documentation that we can find (so far) on how the COMDAT section groups are used, so libdwarf is based on observations of what compilers generate.

Details on separate DWARF object access

There are, at present, two distinct approaches in use to put DWARF information into separate objects to significantly shrink the size of the executable.

One is macOS dSYM. It's a convention of placing the DWARF-containing object in a subdirectory tree.

The other is GNU debuglink and GNU debug_id. These are two distinct ways to provide names of alternative DWARF-containing objects elsewhere in a file system.

If one initializes a Dwarf_Debug object with dwarf_init_path() or dwarf_init_path_dl() appropriately libdwarf will automatically open the alternate object and report on the DWARF there.

See also
https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html

libdwarf provides means to automatically read the alternate object (in place of the one named in the init call) or to suppress that and read the named object file.

int dwarf_init_path(const char * dw_path,
char * dw_true_path_out_buffer,
unsigned int dw_true_path_bufferlen,
unsigned int dw_groupnumber,
Dwarf_Handler dw_errhand,
Dwarf_Ptr dw_errarg,
Dwarf_Debug* dw_dbg,
Dwarf_Error* dw_error);
int dwarf_init_path_dl(const char *dw_path,
char * dw_true_path_out_buffer,
unsigned int dw_true_path_bufferlen,
unsigned int dw_groupnumber,
Dwarf_Handler dw_errhand,
Dwarf_Ptr dw_errarg,
Dwarf_Debug * dw_dbg,
char ** dw_dl_path_array,
unsigned int dw_dl_path_array_size,
unsigned char * dw_dl_path_source,
Dwarf_Error * dw_error);
int dwarf_init_path_dl(const char *dw_path, char *dw_true_path_out_buffer, unsigned int dw_true_path_bufferlen, unsigned int dw_groupnumber, Dwarf_Handler dw_errhand, Dwarf_Ptr dw_errarg, Dwarf_Debug *dw_dbg, char **dw_dl_path_array, unsigned int dw_dl_path_array_size, unsigned char *dw_dl_path_source, Dwarf_Error *dw_error)
Initialization following GNU debuglink section data.

Case 1:

If dw_true_path_out_buffer or dw_true_path_bufferlen are passed in as zero then the library will not look for an alternative object.

Case 2:

If dw_true_path_out_buffer passes a pointer to space you provide and dw_true_path_bufferlen passes in the length, in bytes, of the buffer, libdwarf will look for alternate DWARF-containing objects. We advise that the caller zero all the bytes in dw_true_path_out_buffer before calling.

If the alternate object name (with its null-terminator) is too long to fit in the buffer the call will return DW_DLV_ERROR with dw_error providing error code DW_DLE_PATH_SIZE_TOO_SMALL.

If the alternate object name fits in the buffer libdwarf will open and use that alternate file in the returned Dwarf_Debug.

It's up to callers to notice that dw_true_path_out_buffer now contains a string and callers will probably wish to do something with the string.

If the initial byte of dw_true_path_out_buffer is non-null when the call returns then an alternative object was found and opened.
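The caller-side bookkeeping for Case 2 is small enough to show in full. The two helpers below are hypothetical (they are not libdwarf functions) and do not call the library; they only capture the advice to zero the buffer before the init call and the first-byte test to apply after it returns.

```c
#include <stddef.h>
#include <string.h>

/* Before the init call: zero the caller-provided buffer. */
static void prepare_true_path(char *buf, size_t len)
{
    if (buf != NULL && len > 0) {
        memset(buf, 0, len);
    }
}

/* After the init call returns: non-zero when the buffer now holds
   the path of an alternate object that was found and opened. */
static int alternate_object_opened(const char *buf, size_t len)
{
    return buf != NULL && len > 0 && buf[0] != '\0';
}
```

A typical caller would pass the same buffer and length as dw_true_path_out_buffer and dw_true_path_bufferlen, then log or store the path when alternate_object_opened() reports one. Passing NULL or a zero length corresponds to Case 1, where no alternate object is sought.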

The second function, dwarf_init_path_dl(), is the same as dwarf_init_path() except the _dl version has three additional arguments, as follows:

Pass in NULL or dw_dl_path_array, an array of pointers to strings with alternate GNU debuglink paths you want searched. For most people, passing in NULL suffices.

Pass in dw_dl_path_array_size, the number of elements in dw_dl_path_array.

Pass in dw_dl_path_source as NULL or a pointer to unsigned char. If non-null libdwarf will set it to one of three values:

DW_PATHSOURCE_basic which means the original input dw_path is the one opened in dw_dbg.

DW_PATHSOURCE_dsym which means a macOS dSYM object was found and is the one opened in dw_dbg. dw_true_path_out_buffer contains the dSYM object path.

DW_PATHSOURCE_debuglink which means a GNU debuglink or GNU debug-id path was found and names the one opened in dw_dbg. dw_true_path_out_buffer contains the object path.

Suppressing CRC calculation for debuglink

GNU Debuglink-specific issue:

If GNU debuglink is present and considered by dwarf_init_path() or dwarf_init_path_dl() the library may be required to compute a 32bit crc (Cyclic Redundancy Check) on the file found via GNU debuglink.

See also
https://en.wikipedia.org/wiki/Cyclic_redundancy_check

For people doing repeated builds of such objects the crc check is a waste of time, as they know the crc comparison will pass.

For such situations a special interface function lets the dwarf_init_path() or dwarf_init_path_dl() caller suppress the crc check without having any effect on anything else in libdwarf.

It might be used as follows (the same pattern applies to dwarf_init_path_dl() ) for any program that might do multiple dwarf_init_path() or dwarf_init_path_dl() calls in a single program execution.

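int res = 0;
int crc_check= 0;

/* Suppress the crc check, remembering the previous setting. */
crc_check = dwarf_suppress_debuglink_crc(1);
res = dwarf_init_path(..usual arguments);
/* Reset the crc flag to previous value. */
dwarf_suppress_debuglink_crc(crc_check);
/* Now check res in the usual way. */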
This pattern ensures the crc check is suppressed for this single dwarf_init_path() or dwarf_init_path_dl() call while leaving the setting unchanged for further dwarf_init_path() or dwarf_init_path_dl() calls in the running program.

Recent Changes

We list these with newest first.

Changes 0.5.0 to 0.5.1

A memory leak from dwarf_load_loclists() and dwarf_load_rnglists() is fixed and the libdwarf-regressiontests error that hid the leak has also been fixed.

Changes 0.4.2 to 0.5.0

v0.5.0 released 2022-11-22. The handling of the .debug_abbrev data in libdwarf is now more cpu-efficient (measurably faster) so access to DIEs and attribute lists is faster. The changes are library-internal so are not visible in the API.

Corrects CU and TU indexes in the .debug_names (fast access) section to be zero-based. The code for that section was previously unusable as it did not follow the DWARF5 documentation.

dwarf_get_globals() now returns a list of Dwarf_Global names and DIE offsets whether such are defined in the .debug_names or .debug_pubnames section or both. Previously it only read .debug_pubnames.

A new function, dwarf_global_tag_number(), returns the DW_TAG of any Dwarf_Global that was derived from the .debug_names section.

Three new functions enable printing of the .debug_addr table. dwarf_debug_addr_table(), dwarf_debug_addr_by_index(), and dwarf_dealloc_debug_addr_table(). Actual use of the table(s) in .debug_addr is handled for you when an attribute invoking such is encountered (see DW_FORM_addrx, DW_FORM_addrx1 etc).

Added doc/libdwarf.dox to the distribution (left out by accident earlier).

Changes 0.4.1 to 0.4.2

0.4.2 released 2022-09-13. No API changes. No API additions. Corrected a bug in dwarf_tsearchhash.c where a delete request was accidentally assumed in all hash tree searches. It was invisible to libdwarf users. Vulnerabilities DW202207-001 and DW202208-001 were fixed so error conditions when reading fuzzed object files can no longer crash libdwarf (the crash was possible but not certain before the fixes). In this release we believe neither libdwarf nor dwarfdump leak memory even when there are malloc failures. Any GNU debuglink or build-id section contents were not being properly freed (if malloced, meaning a compressed section) until 9 September 2022.

It's now possible to run the build sanity tests in all three build mechanisms (configure,cmake,meson) on linux, MacOS, FreeBSD, and mingw msys2 (windows). libdwarf README.md (or README) and README.cmake document how to do builds for each supported platform and build mechanism.

Changes 0.4.0 to 0.4.1

Reading a carefully corrupted DIE with form DW_FORM_ref_sig8 could result in reading memory outside any section, possibly leading to a segmentation violation or other crash. Fixed.

See also
https://www.prevanders.net/dwarfbug.xml DW202206-001

Reading a carefully corrupted .debug_pubnames/.debug_pubtypes could lead to reading memory outside the section being read, possibly leading to a segmentation violation or other crash. Fixed.

See also
https://www.prevanders.net/dwarfbug.xml DW202205-001

libdwarf accepts DW_AT_entry_pc in a compilation unit DIE as a base address for location lists (though it will prefer DW_AT_low_pc if present, per DWARF3). A particular compiler emits DW_AT_entry_pc in a DWARF2 object, requiring this change.

libdwarf adds dwarf_suppress_debuglink_crc() so that library callers can suppress crc calculations. (useful to save the time of crc when building and testing the same thing(s) over and over; it just loses a little checking.) Additionally, libdwarf now properly handles objects with only GNU debug-id or only GNU debuglink.

dwarfdump adds --show-args, an option to print its arguments and version. Without that new option the version and arguments are not shown. The output of -v (--version) is a little more complete.

dwarfdump adds --suppress-debuglink-crc, an option to avoid crc calculations when rebuilding and rerunning tests depending on GNU .note.gnu.buildid or .gnu_debuglink sections. The help text and the dwarfdump.1 man page are more specific in documenting --suppress-debuglink-crc and --no-follow-debuglink.

Changes 0.3.4 to 0.4.0

Removed the unused Dwarf_Error argument from dwarf_return_empty_pubnames() as the function can only return DW_DLV_OK. dwarf_xu_header_free() renamed to dwarf_dealloc_xu_header(). dwarf_gdbindex_free() renamed to dwarf_dealloc_gdbindex(). dwarf_loc_head_c_dealloc() renamed to dwarf_dealloc_loc_head_c().

dwarf_get_location_op_value_d() renamed to dwarf_get_location_op_value_c(), and 3 pointless arguments removed. The dwarf_get_location_op_value_d version and the three arguments were added for DWARF5 in libdwarf-20210528 but the change was a mistake. Now reverted to the previous version.

The .debug_names section interfaces have changed. Added dwarf_dnames_offsets() to provide details of facts useful in problems reading the section. dwarf_dnames_name() now does work and the interface was changed to make it easier to use.

Changes 0.3.3 to 0.3.4

Replaced the groff -mm based libdwarf.pdf with a libdwarf.pdf generated by doxygen and latex.

Added support for the meson build system.

Updated an include in libdwarfp source files. Improved doxygen documentation of libdwarf. Now 'make check -j8' and the like works correctly. Fixed a bug where reading a PE (Windows) object could fail for certain section virtual size values. Added initializers to two uninitialized local variables in dwarfdump source so a compiler warning cannot kill a --enable-wall build.

Added src/bin/dwarfexample/showsectiongroups.c so it is easy to see what groups are present in an object without all the other dwarfdump output.

Changes 20210528 to 0.3.3 (28 January 2022)

There were major revisions in going from date versioning to Semantic Versioning. Many functions were deleted and various functions changed their list of arguments. Many many filenames changed. Include lists were simplified. Far too much changed to list here.