Best practices for foreign imports
bgamari - 2021-07-12
tl;dr: When importing system libraries we strongly recommend that users use
GHC’s capi calling convention. For details, see the recommendations section
below.
One of Haskell’s strengths is its great foreign function interface: using
time-tested foreign libraries or raw system calls is just a foreign import
away. However, while syntactically simple, safely using foreign functions can be
quite tricky. A few weeks ago we saw one facet of this problem in the
keepAlive# post.
This week we will look at another complexity which has recently caused us
trouble: calling conventions.
Why this matters
With the increasing prevalence of ARM hardware with Apple’s recent releases, many latent bugs due to calling convention details are becoming more visible.
For instance, in #20079 it was noticed that GHCi crashes on
AArch64/Darwin when the terminal window is resized. We eventually found that
this was due to a bug in haskeline: ioctl, a variadic function, was imported
using GHC’s ccall calling convention.
The fix is straightforward: use the capi pseudo-calling convention introduced
in GHC 7.6.1.
It turns out that incorrect ioctl imports are a rather common pattern among
Hackage packages. Consequently, we thought it would be helpful to offer some
explicit guidance for users.
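As a rough illustration of the kind of fix described above, a capi import of ioctl for querying the terminal size might look like the sketch below. This is not haskeline's actual code: the names c_ioctl, tiocgwinsz and queryWindowSize are made up for this example, and the layout of struct winsize (four unsigned shorts, rows first) is an assumption that holds on Linux and macOS but should be checked on other platforms.

```haskell
{-# LANGUAGE CApiFFI #-}

import Foreign.C.Types (CInt (..), CULong (..), CUShort)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peekByteOff)

-- ioctl is variadic in C, so it is imported via capi rather than ccall.
-- The three-argument shape below is the one this particular request needs.
foreign import capi "sys/ioctl.h ioctl"
  c_ioctl :: CInt -> CULong -> Ptr a -> IO CInt

-- TIOCGWINSZ is a preprocessor constant; a capi "value" import reads it
-- from the header instead of hard-coding a platform-specific number.
foreign import capi "sys/ioctl.h value TIOCGWINSZ"
  tiocgwinsz :: CULong

-- | Ask the terminal attached to a file descriptor for its size, returning
-- (rows, columns), or Nothing if the ioctl fails (e.g. fd is not a tty).
queryWindowSize :: CInt -> IO (Maybe (Int, Int))
queryWindowSize fd =
  -- Assumption: struct winsize is four unsigned shorts (8 bytes), with
  -- ws_row at byte offset 0 and ws_col at byte offset 2.
  allocaBytes 8 $ \p -> do
    rc <- c_ioctl fd tiocgwinsz p
    if rc /= 0
      then pure Nothing
      else do
        rows <- peekByteOff p 0 :: IO CUShort
        cols <- peekByteOff p 2 :: IO CUShort
        pure (Just (fromIntegral rows, fromIntegral cols))
```

Calling queryWindowSize 0 from GHCi or from main queries standard input; when the descriptor is not a terminal the result is Nothing rather than a crash.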
Background: Foreign calling conventions
During a function call both the caller and the callee must agree on several operational details:
- when the function is called:
  - which arguments can be passed in registers?
  - in what order are the remaining arguments pushed to the stack?
  - how are variadic functions handled?
  - must the stack be aligned?
  - where is the return address found?
- when the function returns:
  - who is responsible for popping the arguments from the stack?
  - where is the return value stored?
Together, these details are known as a calling convention and are typically implied by the operating system and target architecture. For instance, x86-64 Linux (and most other POSIX platforms) typically uses the System V amd64 calling convention whereas 32-bit Windows has no fewer than three commonly-used conventions.
When compiling C source, the C compiler determines a function’s
calling convention using its signature, which typically appears in a header
file. However, when GHC imports a function with the usual ccall calling
convention, e.g.:
foreign import ccall "hello_world" helloWorld :: IO ()
it does not have the benefit of a signature; instead it must infer the calling convention from the type given by the import. This can break in two ways:
- many calling conventions treat variadic functions (e.g. printf) differently from the corresponding non-variadic signature; while it is documented that ccall does not support variadic functions, this fact is not well-known by users.
- the type provided by the user may be wrong (e.g. using Int instead of CInt)
Unfortunately, with the foreign import ccall mechanism the compiler has no
way of catching such issues, potentially leaving the user with
difficult-to-spot, platform-dependent soundness bugs.
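To make the second failure mode concrete, here is a minimal sketch of a ccall import whose Haskell type matches the C prototype int abs(int). The names c_abs and absInt are chosen for this example only. The point is that the import itself uses CInt, and any conversion to or from Haskell's Int happens explicitly in a wrapper where it is visible.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.Types (CInt (..))

-- C prototype: int abs(int);
-- The Haskell type uses CInt, not Int. On most 64-bit platforms Int is
-- 64 bits wide while C's int is 32, and with ccall the compiler cannot
-- check the import against the C prototype.
foreign import ccall unsafe "abs"
  c_abs :: CInt -> CInt

-- | Wrapper exposing a Haskell-friendly type. The width conversion is
-- explicit here; note that fromIntegral silently truncates values that
-- do not fit in a CInt.
absInt :: Int -> Int
absInt = fromIntegral . c_abs . fromIntegral
```

With this in scope, absInt (-5) evaluates to 5. Writing the import as Int -> Int would still compile, which is exactly the problem.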
Safe foreign calls via CApiFFI
To help mitigate this class of bugs, GHC 7.10 introduced a new language
extension, CApiFFI, which offers a more robust way to import foreign
functions. Unlike ccall, capi requires that the user specify both the
foreign function’s name and the name of the header file where its
signature can be found. For instance, one can write:
foreign import capi "stdio.h puts" c_puts :: Ptr CChar -> IO CInt
To compile this, GHC will construct a C source file which #includes
stdio.h and defines a stub function which performs the call:
#include "stdio.h"
HsInt32 ghczuwrapperZC0ZCmainZCHelloZCputs(void* a1) {
  return puts(a1);
}
This approach brings a few advantages:
- capi imports can be used to import functions defined using CPP
- the calling convention is decided by the C compiler using the signature provided in the indicated header file, eliminating the potential for inconsistency
- variadic functions “just work”
- it removes the need to worry about which of Windows’ zoo of supported conventions is used (see #12890, #3052)
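As an example of the variadic case, the sketch below imports snprintf through capi with one fixed argument shape (a single int after the format string). The names c_snprintf_int and formatInt, and the 64-byte buffer size, are illustrative choices rather than anything prescribed by GHC. Because the generated stub is compiled against stdio.h, the C compiler applies the platform's variadic calling rules for us.

```haskell
{-# LANGUAGE CApiFFI #-}

import Foreign.C.String (CString, peekCString, withCString)
import Foreign.C.Types (CInt (..), CSize (..))
import Foreign.Marshal.Alloc (allocaBytes)

-- snprintf is variadic; each Haskell import fixes one argument shape.
-- The C compiler sees the real prototype when compiling the capi stub.
foreign import capi "stdio.h snprintf"
  c_snprintf_int :: CString -> CSize -> CString -> CInt -> IO CInt

-- | Format a single integer with a C format string such as "%d".
-- The caller is responsible for the format matching the argument type;
-- output longer than the buffer is truncated by snprintf.
formatInt :: String -> Int -> IO String
formatInt fmt n =
  allocaBytes bufSize $ \buf ->
    withCString fmt $ \cfmt -> do
      rc <- c_snprintf_int buf (fromIntegral bufSize) cfmt (fromIntegral n)
      if rc < 0
        then ioError (userError "snprintf failed")
        else peekCString buf
  where
    bufSize = 64 :: Int
```

For instance, formatInt "value=%d" 42 yields the string "value=42". Other argument shapes (say, a double or two ints) would each need their own import with a matching Haskell type.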
Recommendations for users
As a rule, the easiest code to debug is the code that you don’t need to write.
Consequently, users are encouraged to use existing bindings libraries (e.g.
unix) instead of defining their own foreign imports when possible.
Of course, not all libraries have bindings available. In these cases we
recommend that users use foreign import capi for imports of libraries not
under their control (e.g. system libraries).
Note, however, that capi does incur a small (arguably negligible) runtime
cost due to the C stub. It is justifiable to use ccall to avoid this
runtime cost in cases where the foreign function is shipped with a package’s
cbits, where the calling convention is clear.