Reflecting on Haskell in 2017
Alas, another year has come and gone. It feels like just yesterday I was writing
the last reflection blog post on my flight back to Boston for Christmas. I've
spent most of the last year traveling and working in Europe, meeting a lot of
new Haskellers and putting a lot of faces to names.
Haskell has had a great year and 2017 was defined by vast quantities of new
code, including 14,000 new Haskell projects on Github .
The amount of writing this year was voluminous and my list of interesting work
is eight times as large as last year. At least seven new companies came into
existence and many existing firms unexpectedly dropped large open source Haskell
projects into the public sphere. Driven by a lot of software catastrophes, the
intersection of security, software correctness and formal methods have been
become quite an active area of investment and research across both industry and
academia. It's really never been an easier and more exciting time to be
programming professionally in the world's most advanced (yet usable) statically
typed language.
Per what I guess is now a tradition, I will write my end of year retrospective
on my highlights of what happened in the Haskell scene in retrospect.
Writing
The amount of Haskell writing this year was vast and it's impossible to
enumerate all of the excellent writing that occurred. Some of the truly
exceptional bits that were drafted in 2017 included:
- Old Graphs From New Types by Andrey Mokhov
- Type Tac Toe by Chris Penner
- Typing the Technical Interview by Kyle Kingsbury
- Compiling to Categories by Conal Elliott
- Inlining and Specialisation by Matthew Pickering
- Counterexamples of Typeclasses by Phil Freeman
- An Informal Guide to Better Compiler Errors by Jasper Van der Jeugt
- A Tutorial on Higher Order Unification by Danny Gratzer
- On Competing with C Using Haskell by Chris (kqr)
- Making Movie Monad by David Lettier
- Avoid Overlapping Instances With Closed Type Families by Kwang Yul Seo
- Writing a Formally Verified Browser in Haskell by Michael Burge
- GHC Generics Explained by Mark Karpov
- REST API in Haskell by Maciej Spychała
- I Haskell a Git by Vaibhav Sagar
- Why prove programs equivalent when your compiler can do that for you? by Joachim Breitner
- Free Monads Considered Harmful by Mark Karpov
- Beautiful Aggregations with Haskell by Evan Borden
- Submitting Haskell Functions to Z3 by John Wiegley
Books & Courses
Vladislav Zavialov and Artyom Kazak set out to write a book on the
netherlands of Intermediate Haskell development, a mythical place that we all
seemingly pass through but never speak of again. The project is intuitively called
Intermediate Haskell and is slated to be
released in 2018
Bartosz Milewski finished writing Category Theory for
Programmers
which is freely available and also generated as
PDF.
Brent Yorgey drafted a new course for teaching introduction to Computer Science
using Haskell at Hendrix
University. Dmitry Kovanikov and
Arseniy Seroka also drafted a course for a wide variety of intermediate to
advanced Haskell topics at ITMO
university. Some of which are in Russian but nevertheless большое письмо!
GHC
The Glorious Glasgow Haskell Compiler had it's 8.2 release, and landed to much
rejoicing. Major features such as unboxed sum types landed as planned in GHC
8.2. There were many longstanding issues that were closed and many miscellaneous
fixes contributed in 2017. For instance GHC now uses alternative linkers such as
ld.gold
or ld.lld
instead of the system default ld.
Semigroup is now a superclass of Monoid.
There was quite a bit of work on GHC cross
compilation to ARM. The new build system
Hadrian has been in work for the past
three years, has was finally been merged into the GHC tree.
The DevOps Group has officially started and is being funded to help maintain the
infrastructure used to host Haskell packages and build GHC. The general push of
the group has been toward using hosted CI services, Appveyor and CircleCI and a
greater use of more transparent platforms such as Github for GHC development.
There is work on a major refactor of the AST types to use the new Trees that
Grow research
to allow GHC API user to extend the AST for their own purposes. Eventually this
may allow the split of the AST types out of the ghc package, allowing tooling
authors, Template Haskell users, and the compiler itself to use the same AST
representation.
GHC is partially accepting pull requests on
Github although most of the development
still occurs on the mailing list and Phabricator.
There was significant engineering effort to allow GHC to produce deterministic
build artifacts to
play nicely with caching reproducible build systems such as Buck and Bazel.
Previously the emitted .hi
files would contain non-deterministic data such
as hashes, timestamps and unique name supply counts.
Errors
GHC 8.2 added wonderful new colorful error messages with caret diagnostics for
syntax and type errors:
Compact Regions
Support for ‘Compact Regions’ landed in GHC 8.2. Compact regions are manually
allocated regions where the data allocated inside it are compacted and not
traversed by the GC. This is amenable for long-lived data structures that are
resident in memory without mutation frequently occurring.
The interface can be accessed through the
ghc-compact modules and used
to create compact malloc'd memory.
import Control.DeepSeq
import Control.Exception
import qualified Data.ByteString.Char8 as B
import Data.Compact
main = compact (B.pack ['a'..'c'])
The test suite
inside GHC maintains the best illustrations of its use for complex non-traditional data structures.
Type.Reflection
GHC added a new more expressive Typeable mechanism using the
Type.Reflection
module. typeRep
can be applied with explicit type applications to arrows and
functions can checked for type-level equality after application.
λ> typeRep @(->)
(->) 'LiftedRep 'LiftedRep
λ> typeRep @(Maybe Int) == App (typeRep @Maybe) (typeRep @Int)
True
Coercible
Not new to GHC 8.2 although the base library now exposes the Coercible
constraints allowing polymorphism over types which have the same runtime
representations to be safely coerced at runtime. This can also be extended to
polymorphic functions which take a compile-time proof of the equivalence of two
runtime data layouts.
import Data.Coerce
newtype Meters = Meters Integer deriving (Num)
newtype Feet = Feet Integer deriving (Num)
f :: (Coercible a b, Num b) => a -> b -> b
f x y = coerce x + y
λ> f (Meters 3) 4
7
λ> f (Feet 3) 4
7
λ> f 4 (Feet 3)
Feet 7
GHC tracks the usage (i.e. role) of type variables used as parameters as either
nominal
representational
or phantom
allowing types that differ only in a
phantom type of nominal parameters to be safely coerced.
Join Points
Luke Maurer and Simon Peyton Jones merged new work on join
points which modifies GHC
Core to optimize for join points in code. In Core, a join point is a specially
tagged function whose only occurrences are saturated tail calls. In the actual
GHC Core AST, a join point is simple bit of metadata indicated by IdDetails
of
the binder.
Simon Peyton Jones presented the keynote at Haskell Exchange on his
collaboration on Compiling without
Continuation
which present the ideas and core optimizations that are allowed by the new join
points.
Deriving Strategies
In GHC 8.0 there were two alternative methods for automatic deriving of
typeclass instances, using GeneralizedNewtypeDeriving
and DeriveAnyClass
. In
addition there was also the wired-in deriving mechanisms for Show
, Read
, etc
that were hardcoded into the compiler. These all used the same syntax and would
conflict if multiple pragmas were enabled in a module.
The addition of DerivingStrategies
allows us to disambiguate which deriving
mechanism to use for a specific instance generation.
newtype Meters = Meters Int
deriving stock (Read, Show)
deriving newtype (Num)
deriving anyclass (ToJSON, FromJSON)
Backpack
Edward Yang finished his PhD thesis on Backpack which was integrated in the GHC
tree. The new branch adds support .bkp files
, which specify abstract
interfaces which can be instantiated in modules and used to construct Haskell
modules which work polymorphically across multiple module instantiations.
For example an abstract string type can be written which operates over a
module parameter `Str``:
unit abstract-str where
signature Str where
data Str
len :: Str -> Int
module AStr (alen) where
import Str
alen :: Str -> Int
alen = len
We can create (contrived) instantiations of this module for lists of ints and
chars which expose a polymorphic length function over both.
unit str-string where
module Str where
type Str = String
len :: Str -> Int
len = length
unit str-list where
module Str where
type Str = [Int]
len :: Str -> Int
len = length
The modules can then be opened as specific namespaces with the exported functions
able to be called over both module types.
unit main where
dependency abstract-str[Str=str-string:Str] (AStr as AStr.Int)
dependency abstract-str[Str=str-list:Str] (AStr as AStr.String)
module Main (main) where
import qualified AStr.Int
import qualified AStr.String
main :: IO ()
main = do
print $ AbstractStr.String.alen "Hello world"
print $ AbstractStr.Int.alen [1, 2, 3]
With the latest GHC this can be compiled with the --backpack
which generates
the sum of all the hi
files specified in the .bkp
file and resolves
values at link-time.
$ stack exec -- ghc --backpack example.bkp
While the functionality exists today, I'm not aware of any large projects using
Backpack. Bolting this functionality onto an ecosystem that has spent a decade
routing around many of the problems this system aims to solve, poses a huge
engineering cost and may take a while to crystallize.
Summer of Haskell
Google lacked vision this year and did not sponsor the Haskell Organization for
Summer of Code. But the program proceeded regardless with private sponsorship
from industrial users. Fifteen students were paired with mentors and many
successful projects and collaborations
resulted.
LLVM
The LLVM bindings for Haskell saw quite a bit of work this year and were forked
into a new organization llvm-hs and added
support for LLVM 4.0 - 5.1:
In a collaboration with Joachim Breitner and myself at Zurihac a type-safe LLVM
library which embeds the semantics of LLVM instructions into the Haskell
type-system was written.
Siddharth Bhat started work on an STG to LLVM backend
simplehxc before doing a
rewrite in C++.
Moritz Angermann has been continuing to develop a Haskell library for emitting
and reading LLVM Bitcode format
as well as work on the llvm-ng
backend which is a major rewrite of the GHC LLVM code generator.
Linear Types
Arnaud Spiwack prototyped a extension of
GHC which augments the type system with
linear types. Edsko de
Vries wrote a detailed blog
post about the nature of linearity and its uses.
The proposal extends the typing of functions to include linearity constraints on
arrows, enforcing that variables or references are created and consumed with
constrained reference counts. This allows us to statically enforce reference
borrowing and allocations in the typechecker potentially allowing us to enforce
lifetime constraints on closures and eliminating long-lived memory from being
used with constructed unbounded lifetimes, thereby eliminating garbage
collection for some Haskell functions.
For instance, use of the linear arrow (a ->. b)
can enrich the existing raw
memory access functions enforcing the matching of allocation and free commands
statically. The multiplicity of usage is either 0
, 1
or ω
and the
linear arrow is syntatic sugar for unit multiplicity are aliases for (:'1 ->)
.
malloc :: Storable a => a ->. (Ptr a ->. Unrestricted b) ->. Unrestricted b
read :: Storable a => Ptr a ->. (Ptr a, a)
free :: Ptr a ->. ()
New abstractions such as movable, consumable and dupable references can be
constructed out of existing class hierarchies and enriched with static linearity
checks:
class Consumable a where
consume :: a ->. ()
class Consumable a => Dupable a where
dup :: a ->. (a, a)
class Dupable a => Movable a where
move :: a ->. Unrestricted a
instance Dupable a => Dupable [a] where
dup [] = ([], [])
dup (a:l) = shuffle (dup a) (dup l)
where
shuffle :: (a, a) ->. ([a], [a]) ->. ([a], [a])
shuffle (a, a') (l, l') = (a:l, a':l')
instance Consumable a => Consumable [a] where
consume [] = ()
consume (a:l) = consume a `lseq` consume l
As part of the Summer of Haskell Edvard Hübinette used linear types to construct
a safer stream processing library which can enforce resource consumption
statically using linearity.
There is considerable debate about the
proposal and the nature
of integration of linear types into GHC. With some community involvement these
patches could be integrated quite quickly in GHC.
LiquidHaskell
LiquidHaskell the large
suite of tools for adding refinement types to GHC Haskell continued development
and became considerably more polished. At HaskellExchange several companies were
using it in anger in production. For example, we can enforce statically the
lists given at compile-time statically cannot contain certain values by
constructing a proposition function in a (subset) of Haskell which can refine
other definitions:
measure hasZero :: [Int] -> Prop
hasZero [] = false
hasZero (x:xs) = x == 0 || (hasZero xs)
type HasZero = {v : [Int] | (hasZero v)}
-- Rejected
xs :: HasZero
xs = [1,2,3,4]
-- Accepted
ys :: HasZero
ys = [0,1,2,3]
This can be used to statically
enforce
that logic that consumes only a Just value can provably only be called with a
Just
with a isJust
measure:
measure isJust :: forall a. Data.Maybe.Maybe a -> Bool
isJust (Data.Maybe.Just x) = true
isJust (Data.Maybe.Nothing) = false
This year saw the addition of inductive predicates, allowing more complex
properties about non-arithmetic refinements to be checked. Including properties
about lists:
measure len :: [a] -> Int
len [] = 0
len (x:xs) = 1 + (len xs)
-- Spec for Data.List exports refined types which can be statically refined with
-- length constraints.
[] :: {v:[a]| len v = 0}
(:) :: _ -> xs:_ -> {v:[a]| len v = 1 + len xs}
append :: xs:[a] -> ys:[a] -> {v:[a]| len v = len xs + len ys}
The full library of specifications is now quite extensive and adding
LiquidHaskell
to an existing codebase is pretty seamless.
Foundation
Foundation is an alternative Prelude informed by modern design practices and
data structures. It ships a much more sensible and efficient packed array of
UTF8 points as its default String
type. Rethinks the Num
numerical tower , and statically distinguishes partial functions.
Also has fledgling documentation beyond just
Last year Foundation was a bit early, but this year at Zurihac several companies
in London reported using it fully in production as a full industrial focused
Prelude.
Editor Tooling
Editor integration improved, adding best in modern tooling to most of the
common editors:
+----------------------+-----------------------------------------------------------------------+
| Editor | Haskell Integration |
+======================+=======================================================================+
| Atom | https://atom.io/packages/ide-haskell |
+----------------------+-----------------------------------------------------------------------+
| Emacs | https://commercialhaskell.github.io/intero/ |
+----------------------+-----------------------------------------------------------------------+
| IntelliJ | https://plugins.jetbrains.com/plugin/8258-intellij-haskell |
+----------------------+-----------------------------------------------------------------------+
| VSCode | https://marketplace.visualstudio.com/items?itemName=Vans.haskero |
+----------------------+-----------------------------------------------------------------------+
| Sublime | https://packagecontrol.io/packages/SublimeHaskell |
+----------------------+-----------------------------------------------------------------------+
Monica Lent wrote a lovely post on the state of art in Vim and Haskell
integration.
Rik van der Kleij has done an impressive amount of work adapting the IntellJ IDE
to work with Haskell.
Including a custom parser handling all of the syntax extensions, name lookup,
Intero integration and integration with haskell-tools refactoring framework.
Development on the
haskell-ide-engine has picked
up again in the last few months.
Formal Methods
The DeepSpec, a collaboration between
MIT, UPenn, Princeton and Yale, is working on a network of specification that
span many compilers, languages and intermediate representations with the goal of
achieving full functional correctness of software and hardware. Both Haskell and
LLVM are part of this network of specifications. The group has successfully
written a new formal calculus describing the GHC core language and proved it
type sound in Coq. The project is called
corespec and is described in the paper
"A Specification for Dependent Types in Haskell".
In addition the group also published a paper “Total Haskell is Reasonable Coq”
and provided a utlity hs-to-coq which
converts haskell code to equivalent Coq code.
The Galois group started formalizing the
semantics of Cryptol, a
compiler for high-assurance cryptographic protocols which is itself
written in Haskell.
Michael Burge wrote a cheeky article about extracting a specification for a
"domain specific"
browser
from Coq into Haskell.
Pragma Proliferation & Prelude
Writing Haskell is almost trivial in practice. You just start with the magic
fifty line {-# LANGUAGE ... #-}
incantation to fast-forward to 2017,
then add 150 libraries that you've blessed by trial and error to your cabal
file, and then just write down a single type signature whose inhabitant is
Refl
to your program. In fact if your program is longer than your import list
you're clearly doing Haskell all wrong.
In all seriousness, Haskell is not the small language it once was in 1998, it's
reach spans many industries, hobbyists, academia and many classes of people with
different incentives and whose ideas about the future of the language are
mutually incompatible. Realistically the reason why the Prelude and extension
situation aren't going to change anytime soon is that no one person or company
has the economic means to champion such a change. It would be enormously
expensive and any solution will never satisfy everyone's concerns and desires.
Consensus is expensive, while making everything opt-in is relatively cheap.
This is ultimately the equilibrium we've converged on and barring some large sea
change the language is going to remain in this equilibrium.
Haskell Survey
Taylor Fausak conducted an unofficial
survey
of Haskell users with some surprising results about widespread use of Haskell.
Surprisingly there are reportedly 100 or more people who maintain 100,000 or
more lines of Haskell code. Not so surprisingly most people have migrated to
Stack while vim and emacs are the editors of choice.
While the majority of respondents are satisfied with Haskell the language the
response are somewhat mixed the quality of libraries and the bulk of respondents
reported Haskell libraries being undocumented , hard to use hard
to find, and don't integrate well.
Projects
Idris, the experimental dependently typed language, reached 1.0 release and
became one of the larger languages which is itself written in Haskell.
The most prolific Haskell library Pandoc released its version 2.0.
Several other groups published new compilers in the Haskell-family of languages.
Intel finally open sourced the Intell Haskell
compiler which was a research project in
more optimal compilation techniques. Morgan Stanley also released
Hobbes a Haskell-like language used
internally at the bank featuring several novel extensions to row-types and C++
FFI integration. A prototype Haskell compiler was also written in
Rust.
The SMT solver integration library SBV saw a major rewrite of its internal and
its old tactics
system.
The library is heavily used in various projects as an interface to Z3 and CVC4
solvers.
Uber released a library for parsing and analysis of Vertica, Hive, and Presto
SQL queries.
Wire released the backend service to
their commercial offering.
The Advanced Telematic Systems group in Berlin released a Quickcheck family
library for doing for property testing of models about state
machines. Bose
also released Smudge a tool for doing development and analysis of large state
machines for hardware testing.
Florian Knupfer released a new high-performance HTML combinator
library for templating.
Galois continued with HalVM unikernel development this year, and
several HalVM Docker
Imagaes
were published allowing a very convenient way to write and test code against
HalVM.
Facebook released a Haskell string parsing library
duckling which parses human input into a
restricted set of semantically tagged data. Facebook also prototyped a technique
for hot-swapping Haskell code at runtime
using clever GHC API trickery.
Chris Done released vado a browser engine
written in Haskell.
Jonas Carpay released apecs an entity
component system for game development in Haskell.
Csaba Hruska continued work on a GRIN
compiler and code generator, an
alternative core language for compiling lazy functional languages based on the
work by Urban Boquist.
Dima Szamozvancev released mezzo a library and
embedded domain-specific language for music description that can enforce rules
of compositionality of music statically and prevent bad music from being
unleashed on the world.
Conal Elliot and John Wiegley advanced a novel set of ideas on Compiling with
Categories which allows the bidirectional
representation of Haskell functions as categorical structures. Although
currently implemented as a GHC typechecker extension it is a promising research
area.
Harry Clarke published a paper on Generics Layout-Preserving
Refactoring
using a reprinter. i.e. a tool which
takes a syntax tree, the original source file, and produces an updated source
file which preserves secondary notation. This can be used to build tools like
haskell-tools for arbitrary languages and write complicated refactoring suites.
Joachim Breitner contributed a prolific amount of open source projects including
a new technique (ghc-proofs) for proving the equivalence of Haskell programs
using a GHC plugin, (veggies) a verified simple LLVM code generator for GHC, and
(inspection-testing) new technique for verifying properties of Core.
The ghc-proofs plugin allows us
to embed equation and relations into our Haskell module and potentially allow
GHC to prove them correct by doing symbolic evaluation and evaluating them as
far as possible and checking the equational relations of the equations sides.
Right now it works for contrived and simple examples, but is quite a interesting
approach that may yield further fruit.
{-# OPTIONS_GHC -O -fplugin GHC.Proof.Plugin #-}
module Simple where
import GHC.Proof
import Data.Maybe
my_proof1 = (\f x -> isNothing (fmap f x))
=== (\f x -> isNothing x)
$ ghc Simple.hs
[1 of 1] Compiling Simple ( Simple.hs, Simple.o )
GHC.Proof: Proving my_proof1 …
GHC.Proof proved 1 equalities
Haddock
Haddock is creaking at the seams. Most large Haskell projects (GHC, Stack, Agda,
Cabal, Idris, etc) no longer use it for documentation. The codebase is dated and
long standing issues like dealing with reexported modules are still open.
There is a large vacuum for a better solution to emerge and compatibility with
RestructuredText would allow easy migration of existing documentation.
Databases
This year saw two new approaches to Haskell database integration:
- Selda - A library interacting with relational databases inspired by LINQ and Opaleye.
- Squeal - A deep embedding of PostgreSQL in Haskell using generics-sop.
This still remains an area in the ecosystem where many solutions end up
needlessly falling back to TemplateHaskell to generate data structures and
typically tend to be a pain point. Both these libraries use generics and
advanced type system features to statically enforce well-formed SQL query
construction.
Data Science & Numerical Computing
Chris Doran presented a concise interpretation of Geometric Algebra, a
generalization of linear algebra in terms of graded algebra written in
Haskell. A talk on this
project was presented at the Haskell
Exchange.
At ZuriHac several people expressed interest in forming a Data
Haskell organization to work on advancing the
state of Haskell libraries for data science and macing learning. There is
considerable interest of the constructing the right abstractions for a
well-typed dataframe library.
Tom Nielson is furiously working on a
suite of of projects, mostly
related to data science, machine learning and statistics in Haskell.
The Accelerate project has been seeing a lot of
development work recently and now has support for latest versions of LLVM and
CUDA through llvm-hs project.
A group at the University of Gothenburg released
TypedFlow a statically typed
higher-order frontend to TensorFlow.
Eta
Typelead continued working on a Java Virtual Machine backend to GHC called
Eta. The project has seen considerable amount of
person-hours invested in compatibility and the firm has raised capital to pursue
the project as a commercial venture.
The project does seem to be diverging from the Haskell mainline language in both
syntax and semantics which raises some interesting questions about viability.
Although the Haskell language standard seems to be stuck and not moving forward,
so there's an interesting conundrum faced by those trying to build on top of GHC
in the long run.
WebAssembly
WebAssembly has been released and is supported by all major browsers as of
December 2017. This is been piquing the interest of quite a few Haskellers
(including myself) who have been looking for a saner way to interact with
browsers than the ugly hack of ejecting hundreds of thousands of lines of
generated Javascript source. WebAssembly in either standalone projects or as a
target from GHC's intermediate form STG to the browser offers an interesting
path forward.
Industry
An incomplete list of non-consulting companies who are actively using Haskell or
have published Haskell open sources libraries follows:
+----------------------+
| Product Companies |
+======================+
| Standard Chartered |
+----------------------+
| Galois |
+----------------------+
| Intel |
+----------------------+
| Target |
+----------------------+
| Uber |
+----------------------+
| Vente Privee |
+----------------------+
| FrontRow |
+----------------------+
| NStack |
+----------------------+
| Morgan Stanley |
+----------------------+
| Takt |
+----------------------+
| Fugue |
+----------------------+
| Habito |
+----------------------+
| Asahi Net |
+----------------------+
| Sentenai |
+----------------------+
| IOHK |
+----------------------+
| Awake Networks |
+----------------------+
| Facebook |
+----------------------+
| Adjoint |
+----------------------+
| DigitalAsset |
+----------------------+
| AlphaSheets |
+----------------------+
| Channable |
+----------------------+
| SlamData |
+----------------------+
| Wire |
+----------------------+
| JP Morgan |
+----------------------+
| Bose |
+----------------------+
There are four consulting companies also supporting Haskell in industry.
+----------------------+
| Consultancies |
+======================+
| Well-Typed |
+----------------------+
| Tweag |
+----------------------+
| FP Complete |
+----------------------+
| Obsidian Systems |
+----------------------+
Conferences
Both the Haskell
Exchange
and ZuriHac conference had record attendance this year. Haskell was
well-represented in many interdisciplinary conferences include StrangeLoop,
CurryOn, LambdaDays, LambdaConf, and YOW.
Most of the videos are freely available online.
In 2018 I have plans to visit Zurich, Sydney, Budapest, Amsterdam, San
Francisco, Paris, Copenhagen and Tokyo this year. Hopefully I get to meet more
Haskellers and share some ideas about the future of our beautiful language.
Until then, Merry Christmas and Happy New Haskell Year.