Making the ReScript type system work for you (Part I)
The first step to getting the type checker to work for you is to understand how it works, what work it can and can't do.
The first step to getting the type checker to work for you is to understand how it works, what work it can and can't do. The compiler will take your source code and run an algorithm on it. This algorithm will either find a type error, or guarantee your program is type safe. We call this a type checking algorithm.
🤓 This chapter is not a 1:1 replica of how the ReScript type-checker works, because if it was so it would have to include every single detail in the type-checker. And if it included all that information, then this chapter would have to be the type-checker itself!
Instead this chapter will help you build useful intuitions to understand the work the type-checker does.
Type checking is a broad and deep subject, with plenty of theory to dig into. Here we will take a much lighter approach where we will explore many examples, increasing in complexity, to build a solid intuition of how this part of ReScript works.
In the process we will define small rules to guide us.
By the end, you will be able to look at an arbitrary piece of ReScript and know exactly what's wrong without even running the compiler.
Typing by Example
The core idea behind ReScript's type safety is that every expression always has a single type. Not zero. Not two. Exactly one.
Sometimes the compiler figures out this type by itself. We call this type inference. A large part of this chapter is about understanding how the type inference process works.
Some other times we tell the compiler exactly what the type should be. The compiler can then agree with us and say nothing. Or it can disagree with us and give us a type error.
But there is always exactly one type.
📏 Rule of Principality
In ReScript, every expression has a single type. We call it the Principal Type. This is always the most generic type for the expression, but it can be made more specific using type signatures.
The smallest example we can come up with that shows this is a constant expression:
2112
When ReScript reads this code, it will know that is an integer constant. All constants have predefined types in the compiler, so it can just pick the right one and move on. In this case, it is int
. So we get:
2112 : int
That was quick.
Typing Variables
Our next example is typing a variable binding.
let year = 2022
Intuitively we can say that year
will be of type int
, because we know that 2022
is of type int
and there's nothing else going on, but how does the compiler arrive at the same conclusion?
The first step the compiler takes is to assign a new type variable to every expression. But what exactly is a type variable?
A Type Variable is a sort of hole, a placeholder for a type, that we don't know yet. Since every expression has a type, the compiler automatically puts new type variables everywhere because every expression needs a type, and then later on it will try to make sure all the type variables actually have a type in them.
🤓 New type variables sometimes are called fresh type variables.
In this case, our expression annotated with type variables looks like this:
let year : 'a = (2022 : 'b)
We know that 2022
is a constant integer and has type int
. This limits 'b
so it must be equal to int
. We say that there is a constraint on 'b
. And since we are trying to say that year
is bound to 2022
, then we also know that 'a
should be equal to 'b
. This leaves us with the following type constraints.
'b = int
'a = 'b
It is useful to think of these constraints as a system of equations, like the ones taught in high-school math, to get an intuition of how you could solve them by hand. From these constraints, we can tell that if 'a
is equals to 'b
, and if 'b
is equals to int
, then 'a
is equals to int
.
let year : int = 2022
There we go: the Principal Type of year
is int
.
Of course it would be very inefficient if this had to be done every time we use our year
variable. The compiler will save this result in the Environment, a little table that keeps track of what symbols have what types.
| Symbol | Type |
|--------|------|
| year | int |
That way we can quickly look it up whenever we need it later on.
📏 Rule of Environment Order
Every time something is typed correctly it will be added to the environment. The Environment is built from the top of the module, so that things defined below have access to things defined above.
Let's summarize what we've learned so far. The process of finding a type from these constraints is called Type Inference, and it takes the following steps:
Annotate the code with new type variables
Collect all the constraints
Solve the constraints to find the Principal Type
🤓 The 3 papers outlining this work are listed here:
The Principal Type-scheme of an Object in Combinatory Logic
A Theory of Type Polymorphism in Programming
Principal type-schemes for functional programs
Beware that the writing style is very dry. Even after writing this book I still find these papers super cryptic!
Type Inference in ReScript uses an algorithm discovered and proved correct by Roger Hindley, Robin Milner, and Luis Damas, between 1969 and 1982. This algorithm is the current Gold Standard for type inference, because:
it never infer the wrong types,
almost always can do it without help from you as a programmer, and
it runs in linear time (it is fast!)
With me so far? Great!
Next Steps
This was all for today, but in the next parts of this series we'll continue to explore by example how the type system works, building up on what we did today to learn how to type:
Functions
Complex Expressions
Complex Data (like records and variants)
How Generics Work
and what are the limits of Type Inference in ReScript!
By the end you should have a correct and intuitive understanding of how the entire ReScript type system can work for you, so you can focus on shipping faster, better code!
If you liked this and would like to get notified as soon as they're out, subscribe :)