DWARF support in GHC (part 3)
Ben Gamari - 2020-04-05
This post is the third of a series examining GHC’s support for DWARF debug information and the tooling that this support enables:
- Part 1 introduces DWARF debugging information and explains how its generation can be enabled in GHC.
- Part 2 looks at a DWARF-enabled program in gdb and examines some of the limitations of this style of debug information.
- Part 3 looks at the backtrace support of GHC’s runtime system and how it can be used from Haskell.
- Part 4 examines how the Linux perf utility can be used on GHC-compiled programs.
- Part 5 concludes the series by describing future work, related projects, and ways in which you can help.
Getting backtraces from the runtime
We saw in the last post that GHC’s debug
information can be used by the gdb
interactive debugger to provide
meaningful backtraces of running Haskell programs. However, debuggers
are not the only consumer of these backtraces. For several releases now
the GHC RTS has itself supported stack backtraces. This support can be
invoked in two ways:
- via the SIGQUIT signal
- via the GHC.ExecutionStack interface in base
In the first case, programs built with debug symbols and a libdw-enabled compiler can be sent the SIGQUIT signal [1], resulting in a stack trace being blurted to stderr:
$ vector-tests-O0 >/dev/null & sleep 0.2; kill -QUIT %1
Caught SIGQUIT; Backtrace:
0x1387442 set_initial_registers (rts/Libdw.c:288.0)
0x7feecbf2b0e0 dwfl_thread_getframes (/nix/store/35vnzk39hwsx18d1bkcd30r5xrx026mr-elfutils-0.176/lib/libdw-0.176.so)
0x7feecbf2ab4e get_one_thread_cb (/nix/store/35vnzk39hwsx18d1bkcd30r5xrx026mr-elfutils-0.176/lib/libdw-0.176.so)
0x7feecbf2aea3 dwfl_getthreads (/nix/store/35vnzk39hwsx18d1bkcd30r5xrx026mr-elfutils-0.176/lib/libdw-0.176.so)
0x7feecbf2b479 dwfl_getthread_frames (/nix/store/35vnzk39hwsx18d1bkcd30r5xrx026mr-elfutils-0.176/lib/libdw-0.176.so)
0x1387abd libdwGetBacktrace (rts/Libdw.c:259.0)
0x1373b26 backtrace_handler (rts/posix/Signals.c:534.0)
0x7feecbf7185f (null) (/nix/store/g2p6fwjc995jrq3d8vph7k45l9zhdf8f-glibc-2.27/lib/libpthread-2.27.so)
0x137f24f _rts_stgzuapzup_ret (_build/stage1/rts/build/cmm/AutoApply.cmm:654.18)
0x137de10 stg_upd_frame_info (rts/Updates.cmm:31.1)
0xa7cc80 _randomzm1zi1zmc60864d5616c60090371cdf8e600240f388e8a9bd87aa769d8045bda89826ee2_SystemziRandom_lvl6_siHP_entry (System/Random.hs:489.70)
0x12c52d8 integerzmwiredzmin_GHCziIntegerziType_minusInteger_info (libraries/integer-gmp/src/GHC/Integer/Type.hs:437.1)
0xa7d098 randomzm1zi1zmc60864d5616c60090371cdf8e600240f388e8a9bd87aa769d8045bda89826ee2_SystemziRandom_zdwrandomIvalInteger_info (System/Random.hs:487.20)
0x98da50 _QuickCheckzm2zi13zi2zmac90a2a0d9e0dd2c227d795a9d4d9de22a119c3781b679f3b245300e1b658c43_TestziQuickCheckziArbitrary_sat_sx8e_entry (Test/QuickCheck/Arbitrary.hs:988.26)
0x137de10 stg_upd_frame_info (rts/Updates.cmm:31.1)
...
0x137c810 stg_stop_thread_info (rts/StgStartup.cmm:42.1)
0x136571b StgRunJmp (rts/StgCRun.c:370.0)
0x136241b scheduleWaitThread (rts/Capability.h:219.0)
0x135f35e hs_main (rts/RtsMain.c:73.0)
0x455df4 (null) (/opt/exp/ghc/ghc-8.10/vector/dist-newstyle/build/x86_64-linux/ghc-8.10.0.20191231/vector-0.13.0.1/t/vector-tests-O0/build/vector-tests-O0/vector-tests-O0)
0x7feecbd4ab8e __libc_start_main (/nix/store/g2p6fwjc995jrq3d8vph7k45l9zhdf8f-glibc-2.27/lib/libc-2.27.so)
0x40a82a _start (../sysdeps/x86_64/start.S:122.0)
This can be especially useful in diagnosing unexpected CPU usage or latency in long-running tasks (e.g. a server stuck in a loop).
Note, however, that this currently only provides a backtrace of the program’s main capability. Backtrace support for multiple capabilities is an outstanding task.
Getting backtraces from Haskell
The runtime’s unwinding support can also be invoked from Haskell programs via the GHC.ExecutionStack interface. This provides:
-- | A source location.
data Location = {- ... -}
-- | Returns a stack trace of the calling thread or 'Nothing'
-- if the runtime system lacks libdw support.
getStackTrace :: IO (Maybe [Location])
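As a minimal sketch of how this interface might be called, the following renders the calling thread's stack one frame per line. The helper names renderLocation and currentTraceLines are made up for illustration; the record fields used (objectName, functionName, srcLoc, sourceFile, sourceLine, sourceColumn) are those exported by GHC.ExecutionStack in base.

```haskell
import GHC.ExecutionStack (Location (..), SrcLoc (..), getStackTrace)

-- | Render one frame as "function (file:line.column)", falling back to the
-- object file name when no source location is known.
renderLocation :: Location -> String
renderLocation loc = functionName loc ++ " (" ++ place ++ ")"
  where
    place = case srcLoc loc of
      Just sl ->
        sourceFile sl ++ ":" ++ show (sourceLine sl) ++ "." ++ show (sourceColumn sl)
      Nothing -> objectName loc

-- | Hypothetical helper: the current stack, one line per frame, or a single
-- placeholder line when the runtime system lacks libdw support.
currentTraceLines :: IO [String]
currentTraceLines = do
  mTrace <- getStackTrace
  pure $ case mTrace of
    Nothing     -> ["<no backtrace: RTS lacks libdw support>"]
    Just frames -> map renderLocation frames
```

On a compiler without libdw support this should only ever produce the placeholder line; real frames require a libdw-enabled GHC and a program built with debug symbols.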
In the future we would also like to provide
getThreadStackTrace :: ThreadId -> IO (Maybe [Location])
although this is an outstanding task.
This could be used in a number of ways:
1. when throwing an exception, one could capture the current stack for use in diagnostics output.
2. with getThreadStackTrace, a monitoring library like ekg might provide the ability to enumerate the program’s threads and introspect on what they are up to.
We’ll look at (1) in greater detail below.
Providing backtraces for exceptions
Attaching backtrace information to exceptions is fairly straightforward. For instance, one could provide
data WithStack e = WithStack (Maybe [Location]) e

instance Exception (WithStack e)

throwIOWithStack :: e -> IO a
throwIOWithStack exc = do
    stack <- getStackTrace
    throwM $ WithStack stack exc

throwWithStack :: e -> a
throwWithStack = unsafePerformIO . throwIOWithStack

-- | Attach a stack trace to any exception thrown by the enclosed action.
-- Note that this is idempotent.
addStack :: IO a -> IO a
addStack = handle f
  where
    f :: SomeException -> IO b
    f exc
      | WithStack{} <- fromException exc =
          throwIO exc    -- ensure idempotency
    f (SomeException exc) =
        throwIOWithStack exc
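For readers who want something to compile and experiment with, here is one possible self-contained variant of the sketch above. It is an assumption of this example, not part of the design as stated, that the wrapper hides the wrapped exception's type behind an existential, so that fromException can recognise an already-wrapped exception without knowing what is inside. The hand-written Show instance is likewise only illustrative.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
import Control.Exception
import GHC.ExecutionStack (Location, getStackTrace)

-- | Assumed variant of the wrapper: the wrapped exception's type is hidden
-- so that 'fromException' can recognise any wrapped exception.
data WithStack = forall e. Exception e => WithStack (Maybe [Location]) e

instance Show WithStack where
  show (WithStack stack exc) = show exc ++ " [" ++ frames ++ "]"
    where
      frames = maybe "no backtrace" (\ls -> show (length ls) ++ " frames") stack

instance Exception WithStack

-- | Throw an exception together with the current stack, if available.
throwIOWithStack :: Exception e => e -> IO a
throwIOWithStack exc = do
  stack <- getStackTrace
  throwIO (WithStack stack exc)

-- | Attach a stack trace to any exception escaping the action, leaving
-- exceptions that already carry one untouched.
addStack :: IO a -> IO a
addStack = handle f
  where
    f :: SomeException -> IO b
    f some = case fromException some of
      Just WithStack{} -> throwIO some          -- already wrapped: rethrow as-is
      Nothing -> case some of
        SomeException exc -> throwIOWithStack exc
```

Note that in this variant addStack captures the stack at the point where the handler runs, not where the original exception was thrown; only throwIOWithStack records the throw site itself.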
Keep in mind that DWARF stack unwinding can incur a significant overhead (being linear in the depth of the stack with a significant constant factor). Consequently, it would be unwise to use throwIOWithStack indiscriminately (e.g. when throwing an asynchronous exception to kill another thread). However, for truly “exceptional” cases (e.g. failing due to a non-existent file), it would offer quite some value.
Unfortunately, the untyped nature of Haskell exceptions complicates the migration path for existing code. Specifically, if a library provides a function which throws MyException, users catching MyException would break if the library started throwing WithStack MyException. While this may be manageable in the case of user libraries, for packages at the heart of the Haskell ecosystem (e.g. base) this is a significant hurdle.
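The breakage can be reproduced in a few lines. In this sketch the wrapper is simplified to a newtype with no stack, and recover, oldLibraryCall, and newLibraryCall are hypothetical names standing in for client code and two versions of a library function.

```haskell
import Control.Exception

-- | Stand-in for an exception type exported by some library.
data MyException = MyException
  deriving Show

instance Exception MyException

-- | Simplified stand-in for the wrapper (the stack itself is omitted here).
newtype WithStack e = WithStack e
  deriving Show

instance Exception e => Exception (WithStack e)

-- | Client code written against the old library: it handles 'MyException' only.
recover :: IO String -> IO String
recover action = action `catch` \MyException -> pure "recovered"

-- | The library before the change: throws the bare exception.
oldLibraryCall :: IO String
oldLibraryCall = throwIO MyException

-- | The library after the change: throws the wrapped exception.
newLibraryCall :: IO String
newLibraryCall = throwIO (WithStack MyException)
```

Here recover oldLibraryCall returns normally, whereas recover newLibraryCall lets the exception escape: the handler is selected by type, and WithStack MyException is a different type from MyException.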
Another design which avoids this migration problem is to incorporate backtraces directly into the base SomeException type, which is used to represent all thrown exceptions. Specifically, Control.Exception could then expose a variety of throwing functions, reflecting the many call stack mechanisms GHC now offers:
data SomeException where
SomeException :: forall e. Exception e
=> Maybe [Location] -- ^ backtrace, if available
-> e -- ^ the exception
-> SomeException
-- | A representation of source locations consolidating 'GHC.Stack.SrcLoc',
-- 'GHC.Stack.CostCentre', and 'GHC.ExecutionStack.Location'.
data Location = {- ... -}
-- | Throws an exception with no stack trace.
throwIO :: e -> IO a
-- | Throws an exception with a stack trace captured via
-- 'GHC.ExecutionStack.getStackTrace'.
throwIOWithExecutionStack :: e -> IO a
-- | Throws an exception with a `HasCallStack` stack trace.
throwIOWithCallStack :: HasCallStack => e -> IO a
Of course, this raises the question of which call-stack method a particular exception ought to use. This is often unknowable, depending upon the user’s build configuration. Consequently, we might consider exposing something of the form:
-- | Throws an exception with a stack trace using the most
-- precise method available in the current build configuration.
throwIOWithStack :: HasCallStack => e -> IO a
throwIOWithStack
  | profiling_enabled = throwIOWithCostCentreStack
  | dwarf_enabled     = throwIOWithExecutionStack
  | otherwise         = throwIOWithCallStack
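The profiling_enabled and dwarf_enabled guards above are placeholders. One rough way to approximate the same preference order with what base offers today is to probe each mechanism at run time and take the first that yields anything. This is only a sketch under that assumption: bestStackTrace and StackSource are hypothetical names, and the fallback logic relies on currentCallStack typically returning an empty list without profiling and on showStackTrace returning Nothing without libdw.

```haskell
import GHC.ExecutionStack (showStackTrace)
import GHC.Stack (HasCallStack, callStack, currentCallStack, prettyCallStack)

-- | Which mechanism produced the trace (hypothetical type for this sketch).
data StackSource = FromCostCentres | FromDwarf | FromCallStack
  deriving (Eq, Show)

-- | Collect a textual stack trace using the first mechanism that yields one:
-- cost-centre stacks, then DWARF unwinding, then 'HasCallStack'.
bestStackTrace :: HasCallStack => IO (StackSource, String)
bestStackTrace = do
  ccs <- currentCallStack            -- usually empty unless built with -prof
  if not (null ccs)
    then pure (FromCostCentres, unlines ccs)
    else do
      dwarf <- showStackTrace        -- Nothing when the RTS lacks libdw
      case dwarf of
        Just trace -> pure (FromDwarf, trace)
        Nothing    -> pure (FromCallStack, prettyCallStack callStack)
```

A throwing function could then attach the resulting text to the exception; a real design would presumably keep structured locations rather than a rendered string.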
Finally, new “catch” operations could be introduced providing the handler access to the exception’s stack:
catchWithLocation :: IO a -> (e -> Maybe [Location] -> IO a) -> IO a
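To give a feel for how such a handler might behave, here is a sketch of catchWithLocation layered over a hypothetical existential WithStack wrapper rather than over a modified SomeException, which would require changes to base. The wrapper type and the added Exception e constraint are assumptions of this example.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE ScopedTypeVariables #-}
import Control.Exception
import GHC.ExecutionStack (Location)

-- | Hypothetical wrapper pairing an exception with an optional backtrace.
data WithStack = forall e. Exception e => WithStack (Maybe [Location]) e

instance Show WithStack where
  show (WithStack _ exc) = show exc

instance Exception WithStack

-- | Catch exceptions of type @e@, whether thrown bare or inside 'WithStack',
-- passing the handler the backtrace when one was attached.
catchWithLocation
  :: Exception e => IO a -> (e -> Maybe [Location] -> IO a) -> IO a
catchWithLocation action handler =
  action `catch` \(some :: SomeException) ->
    case fromException some of
      -- Wrapped exception whose payload has the type we are looking for.
      Just (WithStack stack inner)
        | Just exc <- fromException (toException inner) -> handler exc stack
      -- Otherwise: a bare exception of the right type, or something else.
      _ -> case fromException some of
        Just exc -> handler exc Nothing
        Nothing  -> throwIO some     -- not ours: propagate unchanged
```

Existing handlers written with plain catch would still not see wrapped exceptions under this scheme, which is the migration problem the SomeException-based design is meant to avoid.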
Above are just two possible designs; I’m sure there are other points worthy of exploration. Do let me know if you are interested in picking up this line of work.
The next post will look at using the Linux perf utility to profile Haskell executables.
[1] The unfortunate choice of the SIGQUIT signal to dump a backtrace originates from the Java virtual machine implementations, where this has long been available. GHC currently follows this precedent, although some people believe that SIGQUIT should be used for… quitting. Do let us know on #17451 if you feel we should reconsider the choice to follow Java on this point.