GHC Weekly News - 2015/07/29
bgamari - 2015-07-29
Hi *,
Welcome to the latest entry in the GHC Weekly News. Today GHC HQ met to discuss plans post-7.10.2.
GHC 7.10.2 release
GHC 7.10.2 has been released!
Feel free to grab a tarball and enjoy! See the release notes for discussion of what has changed.
As always, if you suspect that you have found a regression don’t hesitate to open a Trac ticket. We are especially interested in performance regressions with fairly minimal reproduction cases.
GHC 7.10.2 and the text package
A few days ago a report came in of long compilation times under 7.10.2 on a program with many Text literals (#10528). This ended up being due to a change in the simplifier which caused it to perform rule rewrites on the left-hand side of other rules. While this is questionable (read “buggy”) behavior, it doesn’t typically cause trouble so long as rules are properly annotated with phase control numbers to ensure they are performed in the correct order. Unfortunately, it turns out that the rules provided by the text package for efficiently handling string literals did not include phase control annotations. This resulted in a rule from base being performed on the literal rules, which rendered the literal rules ineffective. The simplifier would then expend a great deal of effort trying to simplify the rather complex terms that remained.
Thankfully, the fix is quite straightforward: ensure that the text literal rules fire in the first simplifier phase (phase 2). This avoids interference from the base rules, allowing the literal rules to fire as expected.
This fix is now present in text-1.2.1.2. Users of GHC 7.10.2 should use this release if at all possible. Thanks to text’s maintainer, Bryan O’Sullivan, for taking time out of his vacation to help me get this new release out.
While this misbehaviour was triggered by a bug in GHC, a similar outcome could have arisen even without this bug. This highlights the importance of including phase control annotations on INLINE and RULES pragmas: without them, the compiler may choose to rewrite in an order that you did not anticipate. This has also drawn attention to a few shortcomings in the current rewrite rule mechanism, which lacks the expressiveness to encode complex ordering relationships between rules. This limitation pops up in a number of places, including when trying to write rules on class-overloaded functions. Simon Peyton Jones is currently pondering possible solutions to this on #10595.
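To make the role of phase control concrete, here is a minimal sketch. The functions `double` and `quadruple` and the rule name are hypothetical and are not taken from text or base. It assumes GHC's default simplifier phases, which count down from 2 to 0.

```haskell
module Main where

-- Hypothetical functions, used only to illustrate phase control.
double :: Int -> Int
double x = x + x
-- Do not inline `double` before phase 1, so that the rule below can
-- still see calls to it during the earlier phases.
{-# INLINE [1] double #-}

quadruple :: Int -> Int
quadruple x = 4 * x

-- "[~1]" makes the rule active only in the phases before phase 1,
-- that is, while `double` has not yet been inlined away.
{-# RULES
"double/double" [~1] forall x. double (double x) = quadruple x
  #-}

main :: IO ()
main = print (double (double 3))
```

The program computes the same result whether or not the rule fires; the annotations only pin down when the rewrite and the inlining may happen relative to each other. Rules are only applied when optimisation is enabled (for example with -O).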
StrictData
This week we merged the long-anticipated -XStrictData extension (Phab:D1033) by Adam Sandberg Ericsson. This implements a subset of the StrictPragma proposal initiated by Johan Tibell. In particular, StrictData allows a user to specify that datatype fields should be strict-by-default on a per-module basis, greatly reducing the syntactic noise introduced by this common pattern. In addition to implementing a useful feature, the patch ended up being a nice clean-up of GHC’s handling of strictness annotations.
What remains of this proposal is the stronger -XStrict extension, which essentially makes all bindings strict-by-default. Adam has indicated that he may take up this work later this summer.
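As a rough illustration of what StrictData changes, consider the sketch below. The type `Pair` and the helpers `firstField` and `buildsOk` are hypothetical names chosen for this example, and it assumes a GHC that ships the extension.

```haskell
{-# LANGUAGE StrictData #-}
module Main where

import Control.Exception (SomeException, evaluate, try)

-- Hypothetical type for illustration. With StrictData enabled, the first
-- field is strict without a "!" annotation; "~" opts the second field
-- back into laziness.
data Pair = Pair Int ~Int

firstField :: Pair -> Int
firstField (Pair a _) = a

-- Reports whether evaluating a Pair to weak head normal form succeeds.
buildsOk :: Pair -> IO Bool
buildsOk p = do
  r <- try (evaluate p) :: IO (Either SomeException Pair)
  return (either (const False) (const True) r)

main :: IO ()
main = do
  -- The lazy second field is never forced, so this is safe.
  print (firstField (Pair 1 undefined))
  -- The strict first field is forced when the constructor is evaluated.
  buildsOk (Pair undefined 2) >>= print
```

Without the extension, the same declaration would need an explicit `!` on every field that should be strict, which is the syntactic noise the extension removes.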
AMP-related performance regression
In late May Herbert Valerio Riedel opened Phab:D924, which removed an explicit definition for mapM in the [] Traversable instance, as well as redefined mapM_ in terms of traverse_ to bring consistency with the post-AMP world. The patch remains unmerged, however, due to a failing ghci testcase. It turns out the regression is due to the redefinition of mapM_, which uses (*>) where (>>) was once used. This tickles poor behavior in ghci’s ByteCodeAsm module. The problem can be resolved by defining (*>) = (>>) in the Applicative Assembler instance (e.g. Phab:1097). That being said, the fact that this change has already exposed performance regressions raises doubts as to whether it is prudent.
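The shape of that workaround can be sketched as follows. The `Counter` monad here is a hypothetical stand-in, not GHC's Assembler type; only the `(*>) = (>>)` line reflects the change described above.

```haskell
module Main where

import Control.Monad (ap, liftM)

-- Hypothetical stand-in monad (a simple counter), not GHC's Assembler.
newtype Counter a = Counter { runCounter :: Int -> (a, Int) }

instance Functor Counter where
  fmap = liftM

instance Applicative Counter where
  pure x = Counter (\n -> (x, n))
  (<*>)  = ap
  -- Reuse the monadic sequencing operator instead of the default (*>),
  -- which is built from (<*>).
  (*>)   = (>>)

instance Monad Counter where
  Counter m >>= k = Counter (\n -> let (a, n') = m n in runCounter (k a) n')

-- Increment the counter by one.
tick :: Counter ()
tick = Counter (\n -> ((), n + 1))

-- A mapM_-style traversal; the count is the same whichever operator
-- ends up sequencing the steps.
countTicks :: Int -> Int
countTicks k = snd (runCounter (mapM_ (const tick) [1 .. k]) 0)

main :: IO ()
main = print (countTicks 3)
```

This only works when (>>) is not itself defined in terms of (*>); here (>>) falls back to its default definition via (>>=), so the two do not loop.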
GHC Performance work
Over the last month or so I have been working on nailing down a variety of performance issues in GHC and the code it produces. This has resulted in a number of patches which in some cases dramatically improve compilation time (namely Phab:1012 and Phab:D1041). Now that 7.10.2 is out I’ll again be spending most of my time on these issues. We have heard a number of reports that GHC 7.10 has regressed on real-world programs. If you have a reproducible performance regression that you would like to see addressed, please open a Trac ticket.
Merged patches
- Phab:D1028: Fixity declarations are now allowed for infix data constructors in GHCi (thanks to Thomas Miedema)
- Phab:D1061: Fix a long-standing correctness issue arising when pattern matching on floating point values
- Phab:D1085: Allow programs to run in environments lacking iconv (thanks to Reid Barton)
- Phab:D1094: Improve code generation in integer-gmp (thanks to Reid Barton)
- Phab:D1068: Implement support for the MO_U_Mul2 MachOp in the LLVM backend (thanks to Michael Terepeta)
- Phab:D524: Improve runtime system allocator performance with two-step allocation (thanks to Simon Marlow)
That’s all for this time. Enjoy your week!
Cheers,
- Ben