DWARF support in GHC (part 1)
Ben Gamari - 2020-04-03
This post is the first of a series examining GHC’s support for DWARF debug information and the tooling that this support enables:
- Part 1 introduces DWARF debugging information and explains how its generation can be enabled in GHC.
- Part 2 looks at a DWARF-enabled program in gdb and examines some of the limitations of this style of debug information.
- Part 3 looks at the backtrace support of GHC’s runtime system and how it can be used from Haskell.
- Part 4 examines how the Linux perf utility can be used on GHC-compiled programs.
- Part 5 concludes the series by describing future work, related projects, and ways in which you can help.
DWARF debugging information
For several years now GHC has had support for producing DWARF debugging information. DWARF is a widely-used format (used by Linux and several BSDs) for representing debug information (typically embedded in an executable) for consumption by runtime systems, profiling, and debugging tools. It allows representation of a variety of information:
- line information mapping instructions back to their location in the source program (e.g. the instruction at address x originated from myprogram.c line 42).
- unwind information allowing call chains to be reconstructed from the runtime state of the execution stack (e.g. the program is currently executing f, which was called from g, which was called from h, …).
- type information, allowing debugging tools to reconstruct the structure and identity of values from the runtime state of the program (e.g. when the program is executing the instruction at address x, the value sitting in the $rax register is a pointer to a Foobar object).
Collectively, this information is what allows debuggers (e.g. gdb) and profiling tools (e.g. perf) to do what they do.
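One way to see these kinds of information for yourself is to dump them from a binary with readelf from GNU binutils. The following is a minimal sketch and not part of GHC’s tooling: the helper name dump_dwarf is hypothetical, and it assumes that readelf is installed and that the file passed in is an ELF binary built with debug information.

```shell
# dump_dwarf FILE: print the start of each kind of DWARF data found in FILE.
# (dump_dwarf is a hypothetical helper name; it assumes GNU readelf is on the PATH.)
dump_dwarf() {
  # Line information: maps instruction addresses to source files and lines.
  readelf --debug-dump=decodedline "$1" | head -n 20
  # Unwind information: call-frame rules used to walk the stack.
  readelf --debug-dump=frames "$1" | head -n 20
  # Debugging information entries: compilation units, subprograms, and so on.
  readelf --debug-dump=info "$1" | head -n 20
}
```

Calling dump_dwarf ./my-program (where ./my-program is a placeholder path) prints the first few lines of each dump; removing the head filters shows everything, which can be a great deal of output.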
The effort to add DWARF support to GHC started with Peter Wortmann’s dissertation work which introduced the ability for GHC to emit basic line and unwind information in its executables. This support has matured considerably over the past few years and should finally be ready for use with GHC 8.10.
There are a few potential use-cases for DWARF information:
- Use in native debugging tools (e.g. gdb)
- Dumping runtime call stacks to the console using the SIGQUIT signal; this is particularly useful in production
- Computing runtime call stacks from within the program (using the GHC.ExecutionStack interface in base)
- Statistical profiling using tools like perf
- Capturing call stacks in exceptions for reporting to the user
We will discuss all of these in this series of blog posts. The rest of this first post will examine how to compile a DWARF-enabled binary.
First steps
As of GHC 8.10.2, GHC HQ will provide DWARF-enabled binary distributions for Debian 9, Debian 10, and Fedora 27 (as of 8.10.1 only Debian 9 is provided). These binary distributions differ in two respects from the non-DWARF distributions:
- all provided libraries (e.g. base, filepath, unix, etc.) are built with debug information.
- the runtime system is built with a dependency on the libdw library (provided by the elfutils package).
As with other compilers, debug information support in GHC is enabled with the -g flag. This flag can be passed a numeric “debug level”, which determines the detail (and, consequently, size) of the debug information that is produced. These levels are described in the GHC user guide.
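For a single module, the flag can be passed to ghc directly. The sketch below assumes a ghc executable on the PATH; Hello.hs and hello are hypothetical file names chosen only for this example.

```shell
# Write a tiny program (Hello.hs is a hypothetical file name used only for this sketch).
cat > Hello.hs <<'EOF'
main :: IO ()
main = putStrLn "hello"
EOF

# Compile at debug level 2 when a GHC is on the PATH; otherwise just say so.
if command -v ghc >/dev/null 2>&1; then
  ghc -g2 Hello.hs -o hello
else
  echo "ghc not found on PATH; skipping the build" >&2
fi
```

Note that this only covers the module being compiled: the libraries it links against carry debug information only if they were themselves built with it.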
When using native debug information we must keep in mind that all code linked into an executable (e.g. native libraries, Haskell libraries, and the code of the executable itself) must be built with debug information. Failure to ensure this will result in truncated backtraces.
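A quick way to check whether a given executable carries any debug information is to look for a .debug_info section. This is a rough sketch: has_dwarf is a hypothetical helper, it assumes GNU readelf is available, and it only reports whether the section is present at all, not whether every library linked into the executable contributed to it.

```shell
# has_dwarf FILE: succeed if FILE has a .debug_info section, fail otherwise.
# (has_dwarf is a hypothetical helper name; it requires readelf from GNU binutils.)
has_dwarf() {
  readelf --sections "$1" 2>/dev/null | grep -q '\.debug_info'
}

# Example use; ./my-program is a placeholder path.
if has_dwarf ./my-program; then
  echo "debug information present"
else
  echo "no debug information found"
fi
```

A positive result here is necessary but not sufficient: a binary can contain a .debug_info section and still produce truncated backtraces if some of its dependencies were built without -g.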
To build a package with native debug information we can use cabal-install’s --enable-debug-info flag (or, as below, the equivalent debug-info key in cabal.project). Here, we will use the vector testsuite as a non-trivial example:
$ git clone https://github.com/haskell/vector
$ cd vector
$ cat >>cabal.project.local <<EOF
allow-newer: base
package vector
  tests: True
package *
  debug-info: 2
EOF
$ cabal new-build vector-tests-O0
For the sake of demonstration we built the vector-tests-O0 testsuite (which builds vector’s tests without optimisation) since this provides slightly more interesting stacktraces. We chose debug level 2 as we will not be using the GHC-specific debug information emitted by debug level 3.
At this point we have a DWARF-annotated binary. This binary is functionally identical to a non-annotated build (apart from containing quite a few more bits, weighing in at over 150 megabytes). Most importantly, no optimizations were inhibited by enabling debug information.
In the next post we will begin to see what this extra 100 megabytes of debug information gives us.